HISE Logo Forum
    • Categories
    • Register
    • Login

    Simple ML neural network

    Scheduled Pinned Locked Moved General Questions
    134 Posts 18 Posters 11.1k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • ?
      A Former User @d.healey
      last edited by

      @d-healey i am excited to ignite everyone's CPUs 😀

      d.healeyD 1 Reply Last reply Reply Quote 1
      • d.healeyD
        d.healey @A Former User
        last edited by

        @iamlamprey I thought the ML stuff is only in scriptnode?

        Libre Wave - Freedom respecting instruments and effects
        My Patreon - HISE tutorials
        YouTube Channel - Public HISE tutorials

        ? 1 Reply Last reply Reply Quote 0
        • ?
          A Former User @d.healey
          last edited by

          @d-healey uncompiled networks work in Rhapsody, the snex and expr stuff doesn't though

          d.healeyD 1 Reply Last reply Reply Quote 0
          • d.healeyD
            d.healey @A Former User
            last edited by

            @iamlamprey Aha ok that's good to know

            Libre Wave - Freedom respecting instruments and effects
            My Patreon - HISE tutorials
            YouTube Channel - Public HISE tutorials

            1 Reply Last reply Reply Quote 0
            • ?
              A Former User
              last edited by A Former User

              This post is deleted!
              1 Reply Last reply Reply Quote 0
              • ?
                A Former User
                last edited by

                This post is deleted!
                1 Reply Last reply Reply Quote 0
                • LindonL
                  Lindon @Christoph Hart
                  last edited by

                  @Christoph-Hart said in Simple ML neural network:

                  @iamlamprey nope, I‘ve suspended my journey into ML until I have a real use case for it :)

                  If anyone is using this stuff I‘m happy to implement new features or fix issues, but now that the „hello world“ is implemented I expect it to grow with actual projects and their requirement.

                  Well speak of the devil and he will appear.... I now have a customer who would like to make Amp sims using Neural Net ML....

                  So I've spent some time with the research papers, the youTube Videos and the blogs. It would seem I might need a statetful LSTM model - but I could easily be wrong.

                  My understanding of how I(we) might approach this is very very poor, but I think there any number of ways to get to the json code that would be needed to "make it work", but wow do I have a bunch of questions.... here's some of them:

                  1. is this even possible in HISE? (I mean the playback not the modelling or learning)
                  2. Can I "just" use (say) this stuff : https://www.youtube.com/watch?v=xkrqF0D8pfQn and take the .json output and load it into RTNeural and expect it to "sort of" work?
                  3. If not what's the best way to get from a bunch of guitar recordings to a set of .json files that I can then load into math.neural?
                  4. I assume all these Learned models are in fact static snapshots of dynamic processes, so every time the user changes the distortion control on the plugin I will need to load another model - if so how practical is that?

                  HISE Development for hire.
                  www.channelrobot.com

                  ? 1 Reply Last reply Reply Quote 3
                  • ?
                    A Former User @Lindon
                    last edited by

                    @Lindon said in Simple ML neural network:

                    So I've spent some time with the research papers, the youTube Videos and the blogs. It would seem I might need a statetful LSTM model - but I could easily be wrong.

                    Yeh any recurrent network is fine for a guitar amp, Jatin prefers GRU's but all of the guitarML stuff is LSTM if i remember correctly, i think GRUs are a bit easier on the CPU

                    My understanding of how I(we) might approach this is very very poor, but I think there any number of ways to get to the json code that would be needed to "make it work", but wow do I have a bunch of questions.... here's some of them:

                    1. is this even possible in HISE? (I mean the playback not the modelling or learning)

                    if the entire RTNeural framework is implemented, recurrent models should be supported

                    1. Can I "just" use (say) this stuff : https://www.youtube.com/watch?v=xkrqF0D8pfQn and take the .json output and load it into RTNeural and expect it to "sort of" work?

                    if he's just using regular old pytorch and building simple models, you should be able to load the state_dict and run the export script, then bring the JSON into HISE

                    1. If not what's the best way to get from a bunch of guitar recordings to a set of .json files that I can then load into math.neural?

                    this is where training a model comes in, you'd need to learn some basic Python and familiarize yourself with packages like Torch or Tensorflow, as well as some audio-handling ones like Librosa

                    the short version is:

                    • load and preprocess/augment the audio data
                    • convert it into a dataset that can be read by a model
                    • create the model & train it on the data
                    • export the model's state_dict using the RTNeural export script
                    • load it into a Math.Neural node
                    1. I assume all these Learned models are in fact static snapshots of dynamic processes, so every time the user changes the distortion control on the plugin I will need to load another model - if so how practical is that?

                    recurrent models aren't really "static", they can only estimate a single function at a given time, but that function can change depending on the "state" of the network, it basically has a short memory (literally in the name of LSTMs) so the function it's estimating is dependant on that memory

                    that being said, for a guitar amp the typical process is to train several models on various settings of the amp, then crossfade between them

                    LindonL 1 Reply Last reply Reply Quote 0
                    • LindonL
                      Lindon @A Former User
                      last edited by

                      @iamlamprey said in Simple ML neural network:

                      Ok so first: thank you - this is very helpful, but of course it just leads to more questions - and Im slightly afraid this might turn into some sort of ML primer at your time expense, so - I will try and keep to a minimum...

                      @Lindon said in Simple ML neural network:

                      So I've spent some time with the research papers, the youTube Videos and the blogs. It would seem I might need a statetful LSTM model - but I could easily be wrong.

                      Yeh any recurrent network is fine for a guitar amp, Jatin prefers GRU's but all of the guitarML stuff is LSTM if i remember correctly, i think GRUs are a bit easier on the CPU

                      Ok so fine GRU or plain old LSTM - is the "kind of" model yes?

                      My understanding of how I(we) might approach this is very very poor, but I think there any number of ways to get to the json code that would be needed to "make it work", but wow do I have a bunch of questions.... here's some of them:

                      1. is this even possible in HISE? (I mean the playback not the modelling or learning)

                      if the entire RTNeural framework is implemented, recurrent models should be supported

                      here's one for @Christoph-Hart then: is GRU or LTSM implemented in the HIse implementation currently?

                      1. Can I "just" use (say) this stuff : https://www.youtube.com/watch?v=xkrqF0D8pfQn and take the .json output and load it into RTNeural and expect it to "sort of" work?

                      if he's just using regular old pytorch and building simple models, you should be able to load the state_dict and run the export script, then bring the JSON into HISE

                      Well of course I have no idea what he's doing - but interestingly what do you mean by "simple model"?

                      1. If not what's the best way to get from a bunch of guitar recordings to a set of .json files that I can then load into math.neural?

                      this is where training a model comes in, you'd need to learn some basic Python and familiarize yourself with packages like Torch or Tensorflow, as well as some audio-handling ones like Librosa

                      OK well unlike Christoph I once upon a time did a fair bit of Python work _ I really like it as a language...

                      the short version is:

                      • load and preprocess/augment the audio data

                      So here I think (correct me if Im wrong) this means ; playing about 5 mins of guitar thru the amp - and capturing the output as well as capturing the DI-ed guitar signal - then making sure these are edited nicely to phase align with each other. Is that about it?

                      • convert it into a dataset that can be read by a model

                      making it a pickle data set? Or perhaps NumPy

                      • create the model & train it on the data
                      • ha here we get to the bit Im most fuzzy on...-- create the model? So is this just set up a big set of neural net nodes which I can build with something like TensorFlow and Keras yes? no? maybe?

                      As an aside there seems to be a lot of parms I can set up for this - I guess more research..

                      • export the model's state_dict using the RTNeural export script

                      I get a .json file yes?

                      • load it into a Math.Neural node

                      Which I load into here...

                      1. I assume all these Learned models are in fact static snapshots of dynamic processes, so every time the user changes the distortion control on the plugin I will need to load another model - if so how practical is that?

                      recurrent models aren't really "static", they can only estimate a single function at a given time, but that function can change depending on the "state" of the network, it basically has a short memory (literally in the name of LSTMs) so the function it's estimating is dependant on that memory

                      that being said, for a guitar amp the typical process is to train several models on various settings of the amp, then crossfade between them

                      Ok so I need to load up these .json "models" dynamically as the user fiddles with the UI controls...

                      HISE Development for hire.
                      www.channelrobot.com

                      d.healeyD ? 2 Replies Last reply Reply Quote 0
                      • d.healeyD
                        d.healey @Lindon
                        last edited by

                        @Lindon said in Simple ML neural network:

                        playing about 5 mins of guitar thru the amp - and capturing the output as well as capturing the DI-ed guitar signal

                        Is the NN result noticeably better than using an IR derived from the same recordings?

                        Libre Wave - Freedom respecting instruments and effects
                        My Patreon - HISE tutorials
                        YouTube Channel - Public HISE tutorials

                        LindonL 1 Reply Last reply Reply Quote 0
                        • ?
                          A Former User @Lindon
                          last edited by

                          @Lindon said in Simple ML neural network:

                          Ok so fine GRU or plain old LSTM - is the "kind of" model yes?

                          for a bit more nuance than a fully connected network for something like analog modelling, yep.

                          Well of course I have no idea what he's doing - but interestingly what do you mean by "simple model"?

                          by "simple model" i mean one that only has to estimate a function. other types like generative models (VAEs, GANs), or ones that make use of embedding layers might be outside the scope of the current implementation

                          So here I think (correct me if Im wrong) this means ; playing about 5 mins of guitar thru the amp - and capturing the output as well as capturing the DI-ed guitar signal - then making sure these are edited nicely to phase align with each other. Is that about it?

                          pretty much, you'd then cut those 5 minutes up into small pieces so you dont have 5 x 60 x 44100 samples being fed into the model at once

                          making it a pickle data set? Or perhaps NumPy

                          torch & tensorflow both have built-in tools for converting raw data into a readable format for the model, including batching tools which speed up training on a GPU

                          ha here we get to the bit Im most fuzzy on...-- create the model? So is this just set up a big set of neural net nodes which I can build with something like TensorFlow and Keras yes? no? maybe?

                          yep, you basically tell keras/tensorflow/torch which layers, activations (non-linear functions) and input/output shapes you want to ensure it can spit out the data you're expecting it to

                          sidenote: tensor shape errors are one of the worst parts of machine learning, prepare to spend a decent amount of time looking at these

                          I get a .json file yes?

                          yep, the export script in the repo spits out a json file to load into HISE

                          Which I load into here...

                          sorry, this was my mistake. you don't directly load the JSON into the Neural Node, you load it with

                          const nn = Engine.createNeuralNetwork("networkname");
                          
                          nn.loadPytorchModel(myJSONobject);
                          
                          

                          then when you add a Math.neural node, "nn" should appear in the dropdown

                          Ok so I need to load up these .json "models" dynamically as the user fiddles with the UI controls...

                          i don't think hot-swapping the actual networks on the fly would be good, rather use multiple lightweight networks and fade between them with a Gain node or something

                          hope this helps, it's a big subject

                          1 Reply Last reply Reply Quote 1
                          • LindonL
                            Lindon @d.healey
                            last edited by Lindon

                            @d-healey said in Simple ML neural network:

                            @Lindon said in Simple ML neural network:

                            playing about 5 mins of guitar thru the amp - and capturing the output as well as capturing the DI-ed guitar signal

                            Is the NN result noticeably better than using an IR derived from the same recordings?

                            very good question..... anyone got an opinion? Im guessing the NN acts "more dynamically" compared to the IR....

                            HISE Development for hire.
                            www.channelrobot.com

                            ? 1 Reply Last reply Reply Quote 0
                            • ?
                              A Former User @Lindon
                              last edited by

                              @Lindon id say they can be about as accurate as a decent whitebox model, 8.5-9.0/10

                              d.healeyD LindonL 2 Replies Last reply Reply Quote 0
                              • d.healeyD
                                d.healey @A Former User
                                last edited by d.healey

                                @iamlamprey In that case I'd go with the IR which takes a few seconds to make - I have a script that can process 1000s of files to create loads of IRs in one go.

                                @iamlamprey said in Simple ML neural network:

                                rather use multiple lightweight networks and fade between them with a Gain node or something

                                Can do the same with multiple IRs

                                Libre Wave - Freedom respecting instruments and effects
                                My Patreon - HISE tutorials
                                YouTube Channel - Public HISE tutorials

                                1 Reply Last reply Reply Quote 0
                                • LindonL
                                  Lindon @A Former User
                                  last edited by Lindon

                                  @iamlamprey said in Simple ML neural network:

                                  @Lindon id say they can be about as accurate as a decent whitebox model, 8.5-9.0/10

                                  so to get this clear in my head - the ML version is no better at the dynamic processing of guitar input that a set of IRs is?

                                  ..and by "dynamic" I mean different volume levels of input as the instrument is played...quietly (for those AOR ballads) or loudly...(thrashing it)

                                  HISE Development for hire.
                                  www.channelrobot.com

                                  ? 1 Reply Last reply Reply Quote 0
                                  • ?
                                    A Former User @Lindon
                                    last edited by

                                    @Lindon it really depends on the implementation of the model and the data, having a robust model trained on a lot of varying data (different amp settings, different players etc) will sound great, and probably beat IRs and blackbox models

                                    that being said, a dedicated whitebox model recreating each component of the signal path is still probably the best method for physical accuracy, i don't think a test comparing a proper neural net vs a proper physical model has ever been done, for obvious reasons ($$$)

                                    there's still a lot of progress to be made with neural audio, so who knows how things will look in a year or two

                                    LindonL 1 Reply Last reply Reply Quote 2
                                    • LindonL
                                      Lindon @A Former User
                                      last edited by

                                      @iamlamprey said in Simple ML neural network:

                                      @Lindon it really depends on the implementation of the model and the data, having a robust model trained on a lot of varying data (different amp settings, different players etc) will sound great, and probably beat IRs and blackbox models

                                      that being said, a dedicated whitebox model recreating each component of the signal path is still probably the best method for physical accuracy, i don't think a test comparing a proper neural net vs a proper physical model has ever been done, for obvious reasons ($$$)

                                      there's still a lot of progress to be made with neural audio, so who knows how things will look in a year or two

                                      Yes you are probably right, I was however thinking the amount of effort to build a neural net based amp sim would possibly greater than that to build a "nebula like" , "dynamic" IR model...

                                      HISE Development for hire.
                                      www.channelrobot.com

                                      LindonL 1 Reply Last reply Reply Quote 0
                                      • LindonL
                                        Lindon @Lindon
                                        last edited by

                                        @iamlamprey - so does a "well formed/well trained" neural net model perform with the non-linearities of a real amp or not? I guess I'm back at "is ML even worth it right now?" as a question.

                                        HISE Development for hire.
                                        www.channelrobot.com

                                        ? 1 Reply Last reply Reply Quote 0
                                        • ?
                                          A Former User @Lindon
                                          last edited by

                                          @Lindon here's an example of an effective implementation of NN:

                                          Modelling Black-box Audio Effects with Time-varying Feature Modulation

                                          Deep learning approaches for black-box modelling of audio effects have shown promise, however, the majority of existing work focuses on nonlinear effects with behaviour on relatively short time-scales, such as guitar amplifiers and distortion. While recurrent and convolutional architectures can theoretically be extended to capture behaviour at longer time scales, we show that simply scaling the width, depth, or dilation factor of existing architectures does not result in satisfactory performance when modelling audio effects such as fuzz and dynamic range compression. To address this, we propose the integration of time-varying feature-wise linear modulation into existing temporal convolutional backbones, an approach that enables learnable adaptation of the intermediate activations. We demonstrate that our approach more accurately captures long-range dependencies for a range of fuzz and compressor implementations across both time and frequency domain metrics. We provide sound examples, source code, and pretrained models to faciliate reproducibility

                                          favicon

                                          (mcomunita.github.io)

                                          Christian's repos are also public (or were last time i checked), they're more complex models compared to a regular LSTM, often utilizing multi-resolution STFT loss & other advanced things like dilated convolutions

                                          I believe RTNeural has convolution dilation already implemented, so something like this is possible, albeit probably difficult

                                          LindonL 1 Reply Last reply Reply Quote 0
                                          • LindonL
                                            Lindon @A Former User
                                            last edited by

                                            @iamlamprey said in Simple ML neural network:

                                            @Lindon here's an example of an effective implementation of NN:

                                            Modelling Black-box Audio Effects with Time-varying Feature Modulation

                                            Deep learning approaches for black-box modelling of audio effects have shown promise, however, the majority of existing work focuses on nonlinear effects with behaviour on relatively short time-scales, such as guitar amplifiers and distortion. While recurrent and convolutional architectures can theoretically be extended to capture behaviour at longer time scales, we show that simply scaling the width, depth, or dilation factor of existing architectures does not result in satisfactory performance when modelling audio effects such as fuzz and dynamic range compression. To address this, we propose the integration of time-varying feature-wise linear modulation into existing temporal convolutional backbones, an approach that enables learnable adaptation of the intermediate activations. We demonstrate that our approach more accurately captures long-range dependencies for a range of fuzz and compressor implementations across both time and frequency domain metrics. We provide sound examples, source code, and pretrained models to faciliate reproducibility

                                            favicon

                                            (mcomunita.github.io)

                                            Christian's repos are also public (or were last time i checked), they're more complex models compared to a regular LSTM, often utilizing multi-resolution STFT loss & other advanced things like dilated convolutions

                                            I believe RTNeural has convolution dilation already implemented, so something like this is possible, albeit probably difficult

                                            Thanks -listening to the examples was interesting. Its clear the LSTM approach has some weaknesses at low gain inputs...but the approach outlined in the paper seems to deal with these much better...but as you say its probably considerably more difficult to achieve -especially as I have no idea what STFT is or dilated convolutions or how to build a model using them.

                                            Hmm, this is getting more complex the more I look at it.

                                            HISE Development for hire.
                                            www.channelrobot.com

                                            ? 1 Reply Last reply Reply Quote 0
                                            • First post
                                              Last post

                                            28

                                            Online

                                            1.7k

                                            Users

                                            11.7k

                                            Topics

                                            102.3k

                                            Posts