Agentic coding workflows
-
@Christoph-Hart said in Agentic coding workflows:
Oh boy where to begin...
-
All LLMs are not great with HISEscript still. Claude is the best imo but is still prone to making the most basic mistakes (using var instead of local is the most egregious). The newer models are better at adapting their behaviour but are still imperfect. How you are going to bridge that gap is beyond me (the models can only work from the information they have, and there isn't enough information on HISEscript). Personally what I have done is flatten the HISE docs and HISE source code into text files and make the LLM revisit the documentation when it starts to hallucinate. Even with this approach, there's still a lot of hand-holding required.
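To make the var/local point concrete, here is a minimal HiseScript sketch (the function name is made up for illustration) of the pattern LLMs routinely get wrong:

```javascript
// Hypothetical example of the mistake mentioned above: inside a HiseScript
// inline function, locals are declared with `local`, not with JavaScript's `var`.
inline function getScaledValue(input)
{
	local scaled = input * 0.5;  // correct: `local` scopes the variable to the inline function
	//var scaled = input * 0.5;  // typical LLM output: `var` is not valid inside an inline function
	return scaled;
}
```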
-
API Navigation: Continuing from there, the models can be pretty decent at navigating the different APIs when necessary and can describe/use them with a decent amount of accuracy. They are also useful when the docs are incomplete or something isn't working as it should (a useful co-developer).
-
Integration: This is really clunky at the moment and I am a little behind on adopting Claude Code when it comes to HISE projects. The newer models are pretty incredible at holding larger project context, which is what is making things like iPlug3 a real possibility. My old system usually involved creating small projects in Claude and uploading the main scripts of my project. Opus 4 was extremely good at solving individual tasks ("create a LAF for a knob that is a square box and responds to mouse interactions", "create a dropdown menu that loads samples using a panel"). This was a massive timesaver for repetitive tasks handled individually, but it would often introduce bugs and break down when dealing with larger contexts (a full custom preset browser system, an authorization system, etc.)
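As a rough illustration of the first of those prompts, here is a minimal HiseScript LAF sketch (the component name Knob1 and the exact obj properties used are assumptions, not the actual Opus 4 output):

```javascript
// Sketch: draw a knob as a plain square box that reacts to hover and click state.
const var laf = Content.createLocalLookAndFeel();

laf.registerFunction("drawRotarySlider", function(g, obj)
{
	// obj carries the component's area plus its value and mouse state
	g.setColour(obj.hover ? Colours.white : Colours.grey);
	g.fillRect(obj.area);
	g.setColour(Colours.black);
	g.drawRect(obj.area, obj.clicked ? 3.0 : 1.0);
});

Content.getComponent("Knob1").setLocalLookAndFeel(laf);
```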
-
Pipeline: Hugggeee timesaver and probably the first thing worth looking into here. Some of us discussed this recently, but the way you can get certain models (Claude, Gemini, ChatGPT) to write bash scripts and solve the rest of the development pipeline has been a game changer. I have scripts now that just let me compile the plugin for whatever format (AAX/AU/VST3/Standalone), codesign them, place them in the right locations, autogenerate the packaging, etc. This took me about half a day to set up on both systems, and now I never need to look at this tedious part of the pipeline again. Likewise, if I need to perform batch processing on the XMLs it's super useful here (change the PROJECT_FOLDER to an EXP, reposition/duplicate components on the UI, autogenerate headers to incorporate external DSP). I suspect as Claude Code or an equivalent becomes more integrated, such tasks will become even simpler.
-
DSP: Because my C++ skills aren't so great, I am able to leverage the LLMs to help me here. There are severe shortcomings: typically I need to direct the model to specific repos and, more importantly, papers on a particular topic. From there you can start to go back and forth with a model on DSP development and prototyping (@sletz has shared some good examples with Faust).
Summary:
-
Even if AI is guiding you, you still need a decent grasp of DSP and HISE knowledge to guide it effectively. AI accelerates tasks you already understand and can help you learn new concepts, but it still requires careful review and skepticism.
-
Workflow and integration need a huge overhaul to feel seamless. I've tried a few setups to be able to use Claude Code with a HISE project, but nothing feels comfortable currently. Running Claude Code, editing your scripts in an external IDE while recompiling in HISE is convoluted.
-
Pipeline: This is low-hanging fruit and, I think, a good place to start the overhaul of HISE. I'm glad to see the rework you have done on installing and setting up HISE. The same can be done on the other end for compiling and distributing the plugin. Likewise, tools for plugin validation, latency correction, sample distribution and version compatibility could all be streamlined.
Sidenote: The XML/node structure of HISE is ripe for AI collaboration I believe. I haven't tried it, but if the documentation were there I am sure you could guide a model to generate full DSP Networks in scriptnode and set up a basic project structure in the modulation tree, with the components autogenerated on the UI.
-
@HISEnberg I need to watch out that I don't start responding to your input like my LLM would, but: "Excellent feedback!" :)
All LLMs are not great with HISEscript still.
The knowledge gap between HiseScript and any other programming language is a temporary one. Sure, only the big models have somehow been fed enough HiseScript to one-shot scripts that are (mostly) correct without any guidance, but that is precisely the pre-agent state of the art that I assumed would guard us from the robots taking over.
I think I need to catch you guys up a bit on what I did last week so that you can fully understand the fundamental change this will bring: the key to solving the hallucination problem (which, again, I thought was deeply embedded in the very nature of LLMs) is either to ensure that the context window of the LLM contains the full set of API methods, language paradigms and properties (wasteful), or to instruct the LLM to always query an existing set of resources before writing a single line of code.
There are multiple ways to achieve that efficiently, and I came up with writing a custom MCP server that the agent can query to fetch the correct API method names & signatures as well as component properties and LAF obj properties, including descriptions. Then a reference guideline just needs to say: "WHENEVER YOU WRITE LAF CODE, LOOK UP THE FUNCTION NAME AND PROPERTIES!!!!!", give it the source, and the LLM will lazily pull whatever it needs into the context window.
https://github.com/christoph-hart/hise_mcp_server
This provides the LLM with an always up-to-date (because automatically extracted from the source code) property list in JSON format, e.g. here for the LAF properties:
https://github.com/christoph-hart/hise_mcp_server/blob/master/data/laf_style_guide.json
Then whenever the user prompts something blablabla with LAF, it will pull in this style guide, read it and understand what it can call.
Running Claude Code, editing your scripts in an external IDE while recompiling in HISE is convoluted.
The MCP server can talk to HISE via a REST API and send recompile messages, then get the output back as an HTTP response and change the code until it compiles. This is the core development loop that powers the entire thing.
-
The XML/node structure of HISE is ripe for AI collaboration I believe.
Actually, the XML structure is not the best target for this, as it is a stale file that doesn't reflect the actual state of the HISE instance. The MCP server can query the entire component tree and fetch the properties, so you can select (!) some buttons in HISE and just tell it:
"Move the button row to the top"
The agent will then call hise_runtime_get_selected_components, which returns a list of JSON objects for each selected component (i.e. the actual selection (!) you made in the interface designer), apply the modification and send it back using hise_runtime_set_component_properties, then call hise_runtime_get_component_properties to fetch the current state again and verify it against its expected position. This is a trivial example and the verification loop is unnecessary here, but you can imagine how that scales up to be a real workflow boost.
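For clarity, what that modification boils down to on the HISE side is just a component property change; here is a hedged HiseScript equivalent (the component names are hypothetical, and the agent actually does this as JSON through the MCP tools, not through script):

```javascript
// Hypothetical equivalent of "move the button row to the top":
// rewrite the y property of each selected button.
const var buttons = [Content.getComponent("Button1"),
                     Content.getComponent("Button2"),
                     Content.getComponent("Button3")];

for (b in buttons)
	b.set("y", 10); // place the whole row near the top of the interface
```
-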
@Christoph-Hart I find ChatGPT to be surprisingly good at generating HISE scripts, but that might be the accumulation of dozens of prompts and my resulting corrections.
Also, I use Claude in a Cursor project of the full HISE repo. I ask it things like "In HISE script, how would I do XYZ?" and it usually gives a very good answer.
I believe that asking a model that has access to BOTH the full HISE source AND your current project would produce high-quality results.
-
@Christoph-Hart One of the reasons that AI coding works so well with Ruby on Rails web apps is because Rails has a huge amount of convention. It's heavily opinionated, uses standard libraries for lots of stuff and has a clearly defined directory structure.
This means AI spends way less time (and tokens) deciding HOW and WHERE to implement a feature than in something like a vanilla Node project, etc.
HISE has a similarly opinionated and well-defined structure, so I think that can be used to our advantage.
-
I'll also say that although I've seen many horror stories about how much people are spending on AI coding subscriptions, I'm working very happily within a $20/month Cursor plan.
I use Cursor for 2-4 hours a day, pretty much every day, and I haven't hit the plan max since I started paying 18 months ago (Aug 2024).
I do think the fact that I'm mostly using it to work on a Rails app (see post above) helps moderate token usage, and also I keep conversations fairly short, moving to a new chat when the context changes.
Thorough rules files help too, although I'm still fighting to get it to follow certain rules. Seems a bit picky!
-
@dannytaurus lol I burned through the 100 bucks Claude Max subscription in 3 days.
-
@Christoph-Hart Haha, you're in the honeymoon period!

And I suspect you're diving way deeper with it than most of us are.
-
@Christoph-Hart Woah I am gonna have to try this out

-
@HISEnberg sure, I'm holding back about 50-60 commits with that stuff, but as soon as the MVP is solid I'll let you all play with this. The current feature I'm toying around with is the ability to create a list of semantic events that will create an automated mouse interaction:
[
  { type: "click", id: "Button1", delay: 0 },
  { type: "drag", id: "Knob1", delta: [50, 60], delay: 500 },
  { type: "screenshot", crop: "Knob1" }
]
The AI agent sends this list to HISE, HISE opens a new window with an interface, simulates the mouse interaction that clicks on Button1 (I had to hack the low-level mouse handling in JUCE to simulate a real input device for that), then drags Knob1 and makes a screenshot. It evaluates whether the screenshot uses the code path of the LAF it sent to HISE previously, then realizes that the path bounds need to be scaled differently and retries until the screenshot matches its expected outcome. If it just needs to know whether the knob drag updated the reverb parameter, it fetches the module list from HISE and looks at the SimpleReverb's attribute directly - this won't waste context window space on the image analysis.
It's that feedback loop that allows it to autonomously iterate towards the "correct" solution (aka the ralph loop), so even if it only gets it right 90% of the time, it will approach 99.9% after three rounds of this.