Simple ML neural network
-
@hisefilo https://github.com/jatinchowdhury18/RTNeural
here's the old node I was using. I'm not sure if it even works since I haven't touched it in a year, but it has the basic pipeline implemented, and I've added some comments to help explain what's going on. I recommend starting with a simple LSTM like the guitar amp from this:
https://towardsdatascience.com/neural-networks-for-real-time-audio-stateless-lstm-97ecd1e590b8
the Node is called "tensorflow Node" or something but it's not using tensorflow at all
```cpp
// =========================| Third Party Node Template |=========================

#pragma once
#include <JuceHeader.h>
#include <RTNeural.h>
#include "src/model_weights.h"

//RTNEURAL_DEFAULT_ALIGNMENT = 16
//RTNEURAL_USE_EIGEN = 1

namespace project
{
using namespace juce;
using namespace hise;
using namespace scriptnode;

// =================| The node class with all required callbacks |=================

template <int NV> struct tensorflow_node : public data::base
{
	// Metadata Definitions ------------------------------------------------------

	SNEX_NODE(tensorflow_node);

	struct MetadataClass
	{
		SN_NODE_ID("tensorflow_node");
	};

	// set to true if you want this node to have a modulation dragger
	static constexpr bool isModNode() { return false; };
	static constexpr bool isPolyphonic() { return NV > 1; };
	// set to true if your node produces a tail
	static constexpr bool hasTail() { return false; };
	// Undefine this method if you want a dynamic channel count
	static constexpr int getFixChannelAmount() { return 2; };

	// Define the amount and types of external data slots you want to use
	static constexpr int NumTables = 0;
	static constexpr int NumSliderPacks = 0;
	static constexpr int NumAudioFiles = 0;
	static constexpr int NumFilters = 0;
	static constexpr int NumDisplayBuffers = 0;

	//RTNeural Model
	float rtNeuralInput[1] = { 0.0 }; // Need a single-sample array to pass to the model as a "Tensor"
	int modelType = 0;                // can't use Strings
	float predictedNextSample = 0.0;  // initialize our "predicted" sample

	// let's create a few Models
	// input layer goes first
	// then "hidden" layers (LSTMLayerT, GRULayerT, DenseT, Conv1DT)
	// syntax is LayerType<float, inSize, outSize>
	// make sure the sizes match between connected layers
	// finally the output layer (usually a Dense layer with a single output)

	//LSTM Example
	// LSTMs build a learned state from previous samples to predict the next one
	RTNeural::ModelT<float, 1, 1,
		RTNeural::LSTMLayerT<float, 1, 20>,
		RTNeural::DenseT<float, 20, 1>> modelLSTM;

	//GRU Example
	// GRUs do a similar thing to LSTMs, Jatin prefers them
	RTNeural::ModelT<double, 1, 1,
		RTNeural::GRULayerT<double, 1, 20>,
		RTNeural::DenseT<double, 20, 1>> modelGRU;

	//TCN Example
	// TCNs are great all-rounders: with dilated convolutions they can generate a large receptive field
	// the syntax for Conv layers is <double, inSize, outSize, kernel, dilation>
	RTNeural::ModelT<double, 1, 1,
		RTNeural::Conv1DT<double, 1, 1, 16, 1>,
		RTNeural::Conv1DT<double, 1, 1, 16, 2>,
		RTNeural::Conv1DT<double, 1, 1, 16, 4>,
		RTNeural::Conv1DT<double, 1, 1, 16, 8>,
		RTNeural::DenseT<double, 1, 1>> modelTCN;

	//Weights
	void set_weights()
	{
		// the weights are imported from the src/model_weights.h file in the #includes
		// we use Model.get<index>() to get the layer

		//TCN Example
		modelTCN.get<0>().setWeights(conv1Weights);
		modelTCN.get<1>().setWeights(conv2Weights);
		modelTCN.get<2>().setWeights(conv3Weights);
		modelTCN.get<3>().setWeights(conv4Weights);
		modelTCN.get<0>().setBias(conv1Bias);
		modelTCN.get<1>().setBias(conv2Bias);
		modelTCN.get<2>().setBias(conv3Bias);
		modelTCN.get<3>().setBias(conv4Bias);
		modelTCN.get<4>().setWeights(denseWeights);

		//LSTM Example
		modelLSTM.get<0>().setWVals(LSTMWVals);
		modelLSTM.get<0>().setUVals(LSTMUVals);
		modelLSTM.get<0>().setBVals(LSTMBVals);
		modelLSTM.get<1>().setWeights(LSTMDenseWeight);
		modelLSTM.get<1>().setBias(LSTMDenseBias);
	}

	// Scriptnode Callbacks ------------------------------------------------------

	void prepare(PrepareSpecs specs)
	{
		// reset the models here
		modelTCN.reset();
		modelLSTM.reset();
		modelGRU.reset();
		set_weights(); // instantiate weights
	}

	void handleHiseEvent(HiseEvent& e) {}

	void reset()
	{
		// reset everything again
		modelLSTM.reset();
		predictedNextSample = 0.0;
	}

	template <typename T> void process(T& data)
	{
		static constexpr int NumChannels = getFixChannelAmount();
		// Cast the dynamic channel data to a fixed channel amount
		auto& fixData = data.template as<ProcessData<NumChannels>>();
		int numSamples = data.getNumSamples();

		for (auto ch : data) // for each channel
		{
			dyn<float> channel = data.toChannelData(ch);

			for (auto& sample : channel) // for each sample
			{
				if (std::isnan(sample)) // check for NaN
				{
					// we won't have a learned state yet when hitting Play
					sample = predictedNextSample; // use the previous prediction as input if NaN
				}

				// now call a new prediction...
				// the model has an "internal state" or memory
				// which it builds up from previous inputs
				sample = modelLSTM.forward(&sample);
			}
		}

		// Create a FrameProcessor object
		auto fd = fixData.toFrameData();

		while (fd.next())
		{
			// Forward to frame processing
			processFrame(fd.toSpan());
		}
	}

	template <typename T> void processFrame(T& data) {}

	int handleModulation(double& value) { return 0; }

	void setExternalData(const ExternalData& data, int index) {}

	// Parameter Functions -------------------------------------------------------

	template <int P> void setParameter(double v) {}
	void createParameters(ParameterDataList& data) {}
};
}
```
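one aside on the TCN example in there: the whole point of the dilated convolutions is the receptive field, and you can sanity-check it with some quick arithmetic (this is just the standard receptive-field formula for stacked dilated convs, nothing RTNeural-specific):

```cpp
// receptive field of stacked dilated 1D convolutions:
// R = 1 + sum over layers of (kernelSize - 1) * dilation
constexpr int kernel = 16;
constexpr int receptiveField = 1 + (kernel - 1) * (1 + 2 + 4 + 8); // = 226 samples
static_assert(receptiveField == 226, "the four conv layers above see 226 samples of context");
```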
the weights look something like this (prepare your eyes):
```cpp
std::vector<std::vector<std::vector<double>>> conv1Weights = { {
	{ { -0.014952817000448704, 0.05129363760352135, -0.10707060992717743, -0.011642195284366608,
	     0.05948321148753166, 0.11834733933210373, -0.03627893701195717, 0.08669514209032059,
	     0.022034088149666786, -0.03608795627951622, 0.04046545550227165, 0.03425503522157669,
	    -0.0263582244515419, -0.06742122769355774, -0.13450580835342407, 0.12268577516078949 } },
	{ { -0.0675336942076683, 0.03803471103310585, 0.06178430840373039, 0.029312260448932648,
	    -0.04388665407896042, 0.05684272199869156, -0.057864125818014145, -0.11442451924085617,
	    -0.005465891677886248, 0.045379918068647385, -0.10136598348617554, -0.013837937265634537,
	    -0.01725628226995468, -0.02865828201174736, -0.009762179106473923, -0.06300396472215652 } },
	// continues for many pages
```
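side note: if pasting pages of weights into a header gets old, RTNeural can also parse the whole model from a json file at runtime. This sketch follows the loading example from the RTNeural README; "model.json" is a placeholder for whatever your Python export is called:

```cpp
#include <fstream>
#include <RTNeural.h>

double runDynamicModel()
{
    // parse a model exported from the Python side (path is hypothetical)
    std::ifstream jsonStream("model.json", std::ifstream::binary);
    auto model = RTNeural::json_parser::parseJson<double>(jsonStream);
    model->reset(); // clear any recurrent state before processing

    // single-sample inference, same "floats in, floats out" shape as above
    double input[] = { 0.0 };
    return model->forward(input);
}
```

the dynamic model uses virtual dispatch so it's slower than the compile-time ModelT API; for realtime use the baked-in ModelT route is still preferable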
-
@iamlamprey @d-healey thanks mates!!!! I guess now I have plans for the weekend!!!!!
-
@hisefilo Just FYI, I've started working on adding a neural network API to HISE. It will take a while so don't expect immediate results, but my plans are the following:
- add RTNeural as a "network player" that can run trained models. This will be available in the compiled plugin as well as in HISE. The use cases for this are TBD (and I'm open to suggestions for possible applications), but I can imagine there will be both a scripting API for processing input data as well as a scriptnode node that will run the network on either the audio signal or cable level (a bit like `cable_expr` and `core.expr`). The advantage of this library is that it's fairly lightweight, plus it has an emphasis on realtime performance (although there will be use cases like preset creation, etc.).
- add a binding to one of the big ML libraries (the current favorites are either PyTorch or mlpack) to HISE in order to create and train neural networks. I know that the "industry standard" answer to this task is "use Python", however the integration of these libraries into HISE will yield a few advantages, plus I need to learn the API anyways so this will be a drive-by effect of the integration.
With the availability of training models within HISE we'll get these advantages over having to resort to Python:
- we need to convert the model data to be loaded into RTNeural anyways
- we can create training data from within HISE using a script that feeds into the training process
- no need to learn Python and its weird whitespace syntax lol. If the tools are available without having to set up the entire Python ML toolchain, it might be used for simpler tasks too.
I think in order to make this as non-bullshitty as possible (I have no interest in saying "HISE can now do AI too!"), we should talk about which use cases for neural networks occur in the development of HISE projects and then talk about the requirements.
Off the top of my head there are a few applications:
- sound classification. Drop in a sample, the network categorizes it in whatever you want and processes this information => train a network with samples, then run it with user supplied samples to get the information out of it.
- preset creation. Drop in a sample, the network will analyse it and create preset data (a bit like this homie is doing it) => train a network with spectrograms of randomly created presets, then run the inverted process when the user drops a sample. A very powerful boost for this functionality could be the Loris library; I can imagine that having an array of highly precise time-varying gain values associated with the harmonic index makes for much better input data than these few pixels from a spectrogram
- use a neural network to perform audio calculations (from amp sim and other stuff to changing the instrument like the ddsp plugins do). I'm not very picky about distortion and amp simulation (a simple tanh does the job for me lol), but apparently that's one of the prime applications of RTNeural.
The scope of the planned neural network support is currently limited to anything that boils down to "float numbers in, float numbers out" and I'm not deep enough in the ML rabbithole yet to decide how big of a step it would be to add language support (so you can finally make SynthGPT with HISE), plus my current intuition is that there's a rather limited and hype-focused use case for NLP in regards to audio plugins.
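To make that "float numbers in, float numbers out" scope concrete, here is a minimal sketch of what all these use cases boil down to at inference time; the layer sizes are made up for illustration and this isn't tied to any particular HISE API:

```cpp
#include <RTNeural.h>

// a tiny feature-vector network: 4 floats in (e.g. analysis data), 2 floats out
RTNeural::ModelT<float, 4, 2,
    RTNeural::DenseT<float, 4, 8>,
    RTNeural::TanhActivationT<float, 8>,
    RTNeural::DenseT<float, 8, 2>> net;

void predictOnce()
{
    net.reset(); // reset internal state (matters for recurrent layers)

    // in a real project the weights would be loaded from training first
    float features[4] = { 0.1f, 0.2f, 0.3f, 0.4f };
    net.forward(features);                       // run one inference pass

    const float* prediction = net.getOutputs();  // pointer to the 2 output values
}
```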
-
@Christoph-Hart This is pretty awesome news. I agree that RTNeural looks like a good choice to add as a player. I wouldn't be completely bummed if we had to train models outside of HISE, but the integration of PyTorch would be much welcomed to streamline the workflow.
I would most likely be training models of basic things (like guitar amps) but also component-level models.
-
i'll just leave this one here, the most impressive use-case i've seen so far
-
@Christoph-Hart said in Simple ML neural network:
no need to learn Python and its weird whitespace syntax lol
i like Python
-
@Christoph-Hart WOW!!! Nice surprise,
I can see ML in HISE playing the same role HISE did for non-JUCE/C++ experts in the DSP development arena. DDSP or RAVE would also be nice to have onboard.
We are all waiting for your news!!
-
Alright, maybe I'll hold off on the HISE network training part and dust off my Python skillset; the pipeline is just too advanced not to use it for model building.
DDSP and stuff is nice, but I think I need to add "conventional" neural networks first, then we can expand on that (basically fast-forwarding the last 30 years of development in this area lol).
-
@Christoph-Hart Jatin (the guy who made RTNeural) is super friendly and knowledgeable, i'm sure he'd be happy to answer any questions you have about recurrent models and such if you haven't touched base with him already
i don't think training inside HISE would be particularly useful since it would be limited to the CPU anyway right?
-
@iamlamprey No, libtorch has GPU support. But I just discovered TorchStudio, and that's precisely the kind of GUI wrapper I needed to avoid the frustrating Python first-steps stage...
-
@Christoph-Hart Alright, the first experimental integration is pushed. You can now load neural networks using RTNeural and either run inference from the scripting API or run it as a realtime effect using the `math.neural` node. I've created an example project with some hello world stuff and a roundtrip tutorial for getting started with TorchStudio: https://github.com/christophhart/hise_tutorial/tree/master/NeuralNetworkExample
This is far from production-ready, but it should be good enough for playing around; let me know what features might be interesting to add.
-
@Christoph-Hart oh shit! There goes my weekend plans. R.I.P my marriage
-
@Christoph-Hart Awesome. My knowledge of Neural Networks is close to 0 but this is a good opportunity to learn something new. Thank you genius!
-
checking this out now, very exciting stuff. I also appreciate you skipping over MNIST in the roundtrip example
-
@Christoph-Hart Thanks for this
-
https://github.com/christophhart/HISE/commit/a295c6c31d7a44e5323cbfde67395131b223a2b4
:beaming_face_with_smiling_eyes:
-
@Christoph-Hart said in Simple ML neural network:
A very powerful boost for this functionality could be the Loris library; I can imagine that having an array of highly precise time-varying gain values associated with the harmonic index makes for much better input data than these few pixels from a spectrogram
Have you had a chance to play around with this particular use-case? Having the neural node handle the grey(ish)-box modelling sounds a lot more streamlined than fine-tuning an additive synth
-
@iamlamprey nope, I've suspended my journey into ML until I have a real use case for it :)
If anyone is using this stuff I'm happy to implement new features or fix issues, but now that the "hello world" is implemented I expect it to grow with actual projects and their requirements.
-
@Christoph-Hart yeah fair enough, i'll keep noodling on this additive synth but i'll definitely be trying the neural & loris pairing at some point
-
@Christoph-Hart do you happen to have the training/dataloader for the sine model handy? the repo only has tanh
i'm currently migrating the example over to pure torch because torchstudio keeps uninstalling my local packages and in general is kinda gross