Posts

  • RE: Macro Modulators in FX

    @DanH I briefly looked into it, and the main reason it doesn't work is that containers have no voice logic at all, which also includes MIDI processing.

    HiseSnippet 1094.3oc6X0raaaDDdok1fJm1fjVihdjG5Am1f.8mUrQODYKKkJTIKASkz.zCoaVNTZgI2kc4R2XTzmg9DUfduWxiReCZ2kT+PVIDKIXzzVXdffyL6ry2N+sizPofBQQBIxZ2QWEBHqOD6bEWMo0DBii5dJx5d39jHEHsSYcxUgjnHvEYYU3YFFVkJhRd9imdBwmvovBVHzKDLJziEvTK3Nr42v786PbgQrfLqtdytTAukvWDqwSAbYTHgdAYLbFwrrcvnulDMAY8E3CpViV28ItUqdvQMnjFvQdDOuJ0bqWuwgGUo1Qj5G1.Je.x5NscYJgzQQTPjdSOQ3dkyDwOxSMvKXQrW6CFhJHGskSYiZMg46NblyIBgrvCW3pJj5p1C2m4xlyegK69IBrWnQVml0NuKHUYCfjUFHULERO.6PkrP0BIF7bWbWtNB5QzwlrPIcsHqeC2RnW.W83.xEPGolXtB62nb4GYqe8vuxKlSULA2VvOSnfA78e3t+ztk18m209uKxyakxLlQJ78A4JEaRGjuKE2mGG7ZP9H6KI9wv7EpO948o2Y87ozzSclEJ3c4L0fPXJcGguqwWY9d4H.ZpaS+0y6dJQQLAko7zqKDjJlANVmBWpKCRCQkvmBQWnDg5Bgkhe5LGgarOQkOcxTnMUf1ejKFZBT7Hl5prEhaPNV40OG69oXDi60YvTfo+xdNhqb8.qiD9gXfSmw76eC9sO0zIn6hFIM+kleK4RvSHCl5wlqdOXr1PY4LBBBE5NVzrMcbBDB0DFeb99K60rmPD1lSzGc2raxvIjHSNaDnxsM5scjHs6WKeA8hrB6NlKjPZcPV9shiThfYGfjtVNJHLI6v56vMp+X8i2WZdeTK8qZCtYnQ2b8U1YMyK+jjv+73rIQXySMKtdolWSg3Fg34Ig+G.wO.OjonSVcw+Nq.uHzMVwOZ4q79HbaOOfpV.vh3Nubauea8M+8RM+cwNLNjLfRhw+zDZaS0l8y.NHWtMT9YTd65NiR3ZOix.pRa9QRBOJTDkaicf.1HA23EVv74QfIA7bSzLWiCAQtRQ5tl5C4J0wgnhkIoEGGHh445dYUXsSay1OH+UMYV+JZTT78vLQqeNy+VuRMa60RyvnCKHzGZyuD70yMjfwOVOnfGI1WMia9D69BtHbhfyxc024fRxFOFjYw9JOPGqTjrWosWyyAefjMC9ya1Sm4QjZ+Drk9hMeD1UFu9LbJbsMUi1uWhanMXRnpa9jP17e82ucRnamD51Ig1lIgJ7+6IgNWDqzEw8I5l6uQ224r3.G8.HTPacNG70aN1ZGSMWJcYCc5DHb2Dh+T+LUXECs0TgUlI7eDaDPnRwqno+XdSA1GjvQet4I+EOkv8Mz1UPI+.+rwx.8zCuhRyuUKoX0sUwZaqh02VEOXaUrw1p3S1VEO75UzLb2ww598oklHT+gsS6zaM+ZFqBn+BLOcfAL
    

    Here are two LFOs with a slow frequency, one in the root container and one in the sine wave generator. If I play a note, the LFO in the sine wave restarts as it receives the note-on, but the one in the main container keeps cycling at the same position.

    Once we solve that, we can easily allow Matrix modulators and other envelopes to be added to the container (they will be set to monophonic automatically as there are no voices).

    posted in General Questions
  • RE: Mac FX failing pluginval.....

    @Lindon said in Mac FX failing pluginval.....:

    DLL?, If so then if its broken(corrupt) for one HCFX shouldn't it be broken for all of them?

    Yes, but the error you describe is not a deterministic programming logic error but a case of undefined behaviour: something in there is not clearing the buffers correctly, and this may or may not result in trash values being in there, which is what pluginval complained about.
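
    To see why it "may or may not" fail, here's a minimal, generic C++ sketch (not the actual plugin code, just the general idea of an uncleared buffer): heap memory that isn't cleared explicitly contains whatever happened to be there before, so the same binary can look fine on one run and spit out garbage on the next.

    #include <cstdio>
    #include <cstring>

    int main()
    {
        const int numSamples = 512;
        float* buffer = new float[numSamples];   // contents are indeterminate here

        // Reading buffer[0] at this point would be undefined behaviour - it might be 0.0f,
        // it might be leftover garbage from a previous allocation.

        std::memset(buffer, 0, sizeof(float) * numSamples);  // explicit clear removes the randomness
        std::printf("%f\n", buffer[0]);                      // now guaranteed to print 0.000000

        delete[] buffer;
        return 0;
    }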

    posted in General Questions
  • RE: Hise won't open on Windows 10

    What happens if you run the debug build from within VS?

    posted in General Questions
  • RE: Mac FX failing pluginval.....

    @Lindon what if you put the entire network in a frame container?

    posted in General Questions
  • RE: Different repaint methods explaination please...

    I think repaintImmediately() has been considered deprecated for so long that it can be safely removed. A project that old won't open with current versions anyway, for a thousand other reasons...

    If I deprecate it, I would have to make it throw a compilation error at the script parsing stage, not only when it's actually called, as that could go through unnoticed. But yeah, I agree that this could be part of a nice cleanup session along with other things that are completely out of date.
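
    Just to sketch the idea (a generic C++ illustration, not the actual HISE script engine - the function and type names are made up): a check that runs while parsing surfaces the error even if the deprecated call sits in a branch that is never executed.

    #include <cstdio>
    #include <stdexcept>
    #include <string>
    #include <unordered_set>

    // Hypothetical deprecation list - consulted while parsing, not while executing.
    static const std::unordered_set<std::string> deprecatedCalls = { "repaintImmediately" };

    void checkIdentifierWhileParsing(const std::string& identifier, int lineNumber)
    {
        // Throwing here reports the problem at the script parsing stage,
        // even if the offending call is never actually reached at runtime.
        if (deprecatedCalls.count(identifier) > 0)
            throw std::runtime_error(identifier + " is deprecated (line " + std::to_string(lineNumber) + ")");
    }

    int main()
    {
        try { checkIdentifierWhileParsing("repaintImmediately", 42); }
        catch (const std::runtime_error& e) { std::printf("%s\n", e.what()); }
        return 0;
    }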

    posted in Scripting
  • RE: Ring Buffer design

    Here's another example:

    Wrapping is now done with a single bitmask, no floor or % calls.

    So your assumption is that if you know the limit is a power of two, you can skip the expensive modulo call (which is the same as a division) and resort to bit fiddling (a & (LIMIT - 1)). This is theoretically true, but this example shows that the compiler is as smart as you, so there's no need to do this yourself.

    I admit I'm using this a lot in the HISE codebase too, and it's not per se "bad coding style" (it still speeds up the debug builds), but you don't leave performance on the table by not doing so - in fact the index::wrap function just uses the modulo operator, and since you can give it a compile-time upper limit, it will benefit from this optimization for power-of-two buffer sizes too.
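
    The comparison boils down to something like this (a simplified sketch with hypothetical function names, not the original snippet; a limit of 32 to match the assembly output below):

    int wrapRuntime(int index, int limit)
    {
        return index % limit;       // limit is unknown at compile time -> a real division (idiv)
    }

    int wrapPowerOfTwo(int index)
    {
        constexpr int limit = 32;   // power of two known at compile time
        return index % limit;       // the compiler replaces the division with bit operations
    }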

    So the code example calls the default modulo operator, once with an unknown runtime variable and once with a guaranteed power of two. The first function call results in:

    idiv    r8d
    

    which is bad (idiv is a division, so it's slow). The second one results in:

    dec     ecx
    or      ecx, -32                            ; ffffffffffffffe0H
    inc     ecx
    

    which is more or less the bit-fiddling formula above (I wouldn't try to understand what it does exactly, but a simple look at the instruction names hints at some very lightweight instructions).

    posted in C++ Development
  • RE: Ring Buffer design

    @griffinboy no worries, this is very advanced stuff, and with DSP it's always good to have performance in mind. But I really recommend spending some time with Godbolt after watching a 20-minute introduction video to assembly (just make sure to pick a specific CPU instruction set for this, as the syntax varies between CPU types - I'm pretty familiar with x64 assembly, which I studied extensively when writing the SNEX optimizer, but the NEON instruction set for Apple Silicon CPUs is basically gibberish to me).

    It helped me a lot in getting rid of the anxiety of writing slow code - you can expect the compilers to handle these kinds of low-level optimizations pretty well, and seeing the actual instructions the C++ code spits out confirms that in most cases without having to resort to profiling.

    posted in C++ Development
  • RE: Ring Buffer design

    @ustk most likely missing knowledge due to sparse documentation :)

    Efficiency-wise it shouldn't make much of a difference; in the end the compiler will optimize this pretty well, and all the index types in SNEX are templated, so their interpolation can be derived at compile time.

    I haven't played around with fixed-point math at all, so I can't say whether that's much faster than using floating point, but there are a few things in the optimization notes from @griffinboy that are redundant:

    • inline is basically worthless in C++ by now. Every compiler will inline small functions like this even with the lowest optimization settings.
    • there's already a juce::nextPowerOfTwo() which contains the same code without the if branch (which seems unnecessary).
    • using std::fmaf for a simple task like a*b + c seems to complicate the code and make it slower. Here's the godbolt output for this C++ code that compares it vs. its naive implementation:
    #include <cmath>
    
    
    float raw(float a, float b, float c)
    {
        return a * b + c;
    }
    
    float wrapped(float a, float b, float c)
    {
        return std::fmaf(a, b, c);
    }
    
    int main()
    {
        return 0;
    }
    

    x64 assembly output for MSVC with -O3:

    float raw(float,float,float) PROC                           ; raw
            movss   DWORD PTR [rsp+24], xmm2
            movss   DWORD PTR [rsp+16], xmm1
            movss   DWORD PTR [rsp+8], xmm0
            movss   xmm0, DWORD PTR a$[rsp]
    >>      mulss   xmm0, DWORD PTR b$[rsp]
    >>      addss   xmm0, DWORD PTR c$[rsp]
            ret     0
    
    float wrapped(float,float,float) PROC                    
            movss   DWORD PTR [rsp+24], xmm2
            movss   DWORD PTR [rsp+16], xmm1
            movss   DWORD PTR [rsp+8], xmm0
            sub     rsp, 40                             ; 00000028H
            movss   xmm2, DWORD PTR c$[rsp]
            movss   xmm1, DWORD PTR b$[rsp]
            movss   xmm0, DWORD PTR a$[rsp]
    >>      call    QWORD PTR __imp_fmaf
            add     rsp, 40                             ; 00000028H
            ret     0
    

    You don't need to be able to fully understand or write assembly to extract valuable information from that; the most obvious thing is that the second function has more lines, which means more time spent there. But it gets worse when you take a closer look - I've marked the relevant lines with >>. The first function boils down to two single instructions which are basically free on a modern CPU, while the wrapped function invokes a function call, including copying the values into call registers etc., which is an order of magnitude slower than the first example.

    TLDR: Start with the implementation that is the easiest to understand / write / debug, then profile or put isolated pieces into Godbolt to see what can be improved.

    EDIT: I've messed up the compiler flags somehow (-O3 is a clang compiler flag, with MSVC it's -Ox). The fully optimized assembly looks like this:

    float raw(float,float,float) PROC                           ; raw
            mulss   xmm0, xmm1
            addss   xmm0, xmm2
            ret     0
    
    float wrapped(float,float,float) PROC                       ; wrapped
            jmp     fmaf
    

    so it calls the inbuilt CPU instruction directly, which indeed seems quicker, but I would still expect that this doesn't make much of a difference.

    posted in C++ Development
  • RE: Different repaint methods explaination please...

    @ustk it's a bit faster, but the main advantage is that you can use local variables, which are more tightly scoped.

    posted in Scripting
  • RE: Different repaint methods explaination please...

    @d-healey actually repaint and repaintImmediately do the same thing now - this distinction existed before I rewrote the threading model. Both will execute the code on the scripting thread (and if the call is made on the scripting thread, it will be executed right after the current function call); then a list of paint directives is created and sent to the UI thread, where it renders the interface.
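
    As a rough sketch of that pattern (generic C++, not the actual HISE implementation - the type names are made up): the scripting thread pushes a list of paint directives, and the UI thread picks up the latest snapshot when it renders.

    #include <mutex>
    #include <string>
    #include <vector>

    struct PaintDirective { std::string op; };   // e.g. "fillRect", "drawText", ...

    class PaintDirectiveQueue
    {
    public:
        // Scripting thread: store the directives created by the paint routine.
        void push(std::vector<PaintDirective> directives)
        {
            std::lock_guard<std::mutex> lock(mutex);
            pending = std::move(directives);
        }

        // UI thread: grab the latest snapshot and render it.
        std::vector<PaintDirective> takeForRendering()
        {
            std::lock_guard<std::mutex> lock(mutex);
            return std::move(pending);
        }

    private:
        std::mutex mutex;
        std::vector<PaintDirective> pending;
    };

    int main()
    {
        PaintDirectiveQueue queue;
        queue.push({ { "fillRect" }, { "drawText" } });      // scripting thread side
        auto snapshot = queue.takeForRendering();            // UI thread side
        return 0;
    }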

    posted in Scripting