Phaselocking 2025 and other sampler related discussions

Christoph Hart

So,

I'm currently a bit stuck in the development of the new complex sampler module - it goes like it always goes and the feature set quickly exploded into something much more complex than the original idea...

So maybe some of you who have worked with legato samples in the past can chime in, so we can figure out a feature set that is most usable.

So I'm currently working on the legato layer, which is a dedicated group logic for handling legato transitions. Once all samples are assigned properly (the "group value" is simply the start transition note), it will automatically handle the sample selection to play the correct transition from the old to the new note with no scripting required.

However I realized (and this came also up in our last meetup) that the sample selection is the easy part and getting the legato transitions right involves a lot of custom tweaking and that's where the fun begins:

You have to stop the old note
Play the legato transition
Play the new sustain note (with a delay that is about as long as the legato transition)

This yields several parameters (the fade time between the samples, the length of the legato sample, etc.), which in the traditional way of doing things (legato samples and sustain samples being loaded into separate samplers) is kind of doable by just adding some script processors in each sampler that perform this task.
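To make the three steps and their parameters concrete, here is a minimal Python sketch of the schedule. It is not the engine code: `VoiceEvent`, `schedule_legato` and the parameter names are made up for illustration, and whether the sustain delay should include or exclude the fade overlap is left open, so it is a separate argument that defaults to the legato sample length.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VoiceEvent:
    time_ms: float  # when the action happens, relative to the new note-on
    action: str     # "fade_out" or "start"
    target: str     # which voice the action applies to
    fade_ms: float  # fade length used by the action


def schedule_legato(legato_length_ms: float, fade_ms: float,
                    sustain_delay_ms: Optional[float] = None) -> List[VoiceEvent]:
    """Build the three actions of one legato transition (hypothetical helper)."""
    if sustain_delay_ms is None:
        # Assumption: delay the new sustain by the full legato sample length.
        sustain_delay_ms = legato_length_ms
    return [
        VoiceEvent(0.0, "fade_out", "old_sustain", fade_ms),       # 1. stop the old note
        VoiceEvent(0.0, "start", "legato", fade_ms),               # 2. play the transition
        VoiceEvent(sustain_delay_ms, "start", "new_sustain", fade_ms),  # 3. delayed sustain
    ]
```

Calling `schedule_legato(400.0, 180.0)` yields the old sustain fading out and the legato sample starting at 0 ms, with the new sustain starting 400 ms later.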

Now the main problem that I'm facing currently is that the current scripting API does not offer control for handling different voices that are launched within the same sampler module and there's no easy solution to this without blowing up the architectural complexity even further. Usually I would just say "well, guys until here" and ditch the entire thing (or at least the legato layer) BUT there is one awesome feature that presents itself once you are putting the sustain samples and the legato samples into the same sampler (aside from the organisational benefits): With a little bit of computational overhead we can get access to the playback position of the sustain sample and then sync both the legato transitions as well as the delayed target note to be perfectly in sync with each other.

I've been working with artificial samples (just a sine wave with "legato transitions") and the transition lines up perfectly (after a weekend of debugging it also respects the HISE_EVENT_RASTER that aligns the MIDI position to multiples of 8, lol):

The top waveform is the sustain sample and the bottom one the legato sample. The legato layer will calculate the next zero crossing of the currently played sustain note and move the start of the legato sample to perfectly match the phase of the sustain sample. It also applies a fade to the start and end of each sample and the resulting sound in this artificial example is a perfect sine wave transition without any noticeable transitions between the samples. Noice noice noice.
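The zero-crossing alignment can be sketched in a few lines of Python. This is only an illustration of the idea on raw sample lists, not the engine implementation: the function names are invented, fades are left out, and the HISE_EVENT_RASTER rounding is ignored.

```python
import math


def make_sine(length, period, shift=0.0):
    """Test signal; the 0.5 offset keeps samples off exact zeros."""
    return [math.sin(2.0 * math.pi * (i + 0.5 + shift) / period) for i in range(length)]


def next_rising_zero_crossing(samples, start=0):
    """First index i >= max(start, 1) with samples[i-1] < 0 <= samples[i], else None."""
    for i in range(max(start, 1), len(samples)):
        if samples[i - 1] < 0.0 <= samples[i]:
            return i
    return None


def align_legato(sustain, playback_pos, legato):
    """Return (trigger_index, legato_offset).

    trigger_index: sustain position at which the legato sample should start,
    legato_offset: start offset into the legato sample so both begin a cycle there.
    """
    trigger_index = next_rising_zero_crossing(sustain, playback_pos)
    legato_offset = next_rising_zero_crossing(legato, 0)
    if trigger_index is None or legato_offset is None:
        return None
    return trigger_index, legato_offset
```

With two sines of the same period, the samples from `sustain[trigger_index]` and `legato[legato_offset]` onwards then run in phase, which is the situation in the screenshot.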

So this is kind of the carrot that is hanging in front of my face that keeps me on pursuing this feature set as I think that this will highly improve the sound quality of legato samples and having this built into the engine would be awesome. However there are a few caveats:

Obviously the success rate of this function depends on the consistency of the samples. It's all fun and games with an artificially generated sine wave with perfect pitch, but with real sounds that have a fluctuating pitch the phase-syncing becomes impossible as it will fall apart within a few cycles. By pitchlocking (or phaselocking) the samples this can be mitigated though. I haven't played around with pitch / phaselocking for quite some time though so if anyone can bring me up to speed on what people are using for phaselocking their samples in 2025 it would be awesome.
Currently the entire functionality sits inside the engine and I'm not sure if I can expose this to the scripting layer for additional customizability. Of course stuff like the fade time can be exposed, and if you want to seek into the transitions to make them more snappy I could make the engine respect the start offset (which again needs to be moved to the correct phase position internally), but that's about it. I would love to hear from anyone who has ever implemented legato transitions if there is something that I'm missing here. Of course there is polyphonic legato (see below), but apart from that have you implemented more features?
Polyphonic Legato. In order to support poly legato I would simply implement that functionality right into the legato layer module and expose its bypass state as well as a few selected internal parameters. It's been a while since I last wrote a polyphonic legato script (2012 lol), but IIRC there are not too many parameters that need to be customizable: it's actually just the chord detection threshold that groups together notes that are played closely after each other, and the rest follows a pretty standardised rule set.

So long story short: in order to continue my quest for a perfect legato engine, please let me know these three things:

What are you using for pitch/phase-locking your samples and with what success rate?
Do you need more parameters exposed in the default legato transition mode?
Do you need more parameters exposed for the poly legato engine?

Oh and if anyone has a nice sample set that you want to share with me so I can check this stuff with real-world samples, it would be great. These samples will obviously not be published anywhere, it's just for my internal development process.

David Healey

@Christoph-Hart said in Phaselocking 2025 and other sampler related discussions:

I haven't played around with pitch / phaselocking for quite some time though so if anyone can bring me up to speed on what people are using for phaselocking their samples in 2025 it would be awesome.

I have a custom tool (C++) that I commissioned specifically for phase-locking. I can share the repo with you if you're interested. It was actually written by Chris Cannam of RubberBand fame. It doesn't work so well with shorter samples though, it messes with the attack, but we were optimising for sustains so maybe this can be improved.

For some instruments I'm just resynthesising them with Loris and flattening the pitch.

How will this system work with multi-mic samples where the phase is all over the place? Will it just sync individual mics together?

Lindon

@Christoph-Hart said in Phaselocking 2025 and other sampler related discussions:

So long story short: in order to continue my quest for a perfect legato engine, please let me know these three things:

What are you using for pitch/phase-locking your samples and with what success rate?

Do you need more parameters exposed in the default legato transition mode?

Do you need more parameters exposed for the poly legato engine?

Well, to be honest I'm in a somewhat different relationship with the audio, but it may work as a model: I say to the client "your problem" and, like in Dave's explanation, they then use some external third-party tool set to get the files into "as good a state" as they can. I then essentially do what you've outlined above, with a bunch of manual tweaking to get the playback positions synced up as best I can, either with a global or a per-transition rule set. Given you have a very, very attractive potential solution for that part, I'm all good "as is".
Nope all good.
Nope again all good.

Christoph Hart

@d-healey said in Phaselocking 2025 and other sampler related discussions:

How will this system work with multi-mic samples where the phase is all over the place? Will it just sync individual mics together?

I would assume that the phase relation is consistent across mic positions - it might have a different phase but since it's the same waveform that phase difference will stay constant, no?

So you can just align the first mic position and the other ones should follow suit. Ideally you would use the closest position as "guide".

Also there's still a lot of work to be done on my side. One of the tasks is to find a more reliable way to detect the phase position than just looking for a zero crossing from negative to positive, as most real waveforms have more than one zero crossing per wave cycle.
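For illustration only: one common alternative to zero-crossing detection is to pick the start offset whose first window of samples correlates best with the upcoming sustain samples. The Python sketch below is not what the engine does or plans to do, and the function names and the test waveform (a fundamental plus a strong third harmonic, which has three rising zero crossings per cycle) are assumptions made for the example.

```python
import math


def make_wave(length, period, shift=0.0):
    """Fundamental plus a strong 3rd harmonic: several zero crossings per cycle."""
    out = []
    for i in range(length):
        phase = 2.0 * math.pi * (i + 0.5 + shift) / period
        out.append(math.sin(phase) + 2.0 * math.sin(3.0 * phase))
    return out


def best_alignment_offset(reference, candidate, window, max_offset):
    """Offset in 0..max_offset where candidate matches reference[:window] best.

    Uses a plain dot product as the correlation score; candidate must hold at
    least max_offset + window samples.
    """
    best_offset, best_score = 0, float("-inf")
    for offset in range(max_offset + 1):
        score = sum(reference[k] * candidate[offset + k] for k in range(window))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset
```

Here `reference` would be the sustain samples from the trigger position onwards and `candidate` the legato sample. Searching one period of offsets over a one-period window is enough for this test signal; real recordings would need a longer window and cost more CPU.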

tomekslesicki

@Christoph-Hart will it also be possible to disable the crossfade after the transition and stay on the legato transition sample? I’m thinking about a scenario where the samples are edited so that the transition sample also contains the sustain portion to loop.

Christoph Hart

@tomekslesicki Ah yes, that should be an option.

It currently works by assigning the "Ignore" flag to the sustain samples (so that they are played regardless of which legato transition is active), but I can add an option to not start the sustain sample of the target note if the legato transition already includes the next sustain phase.

tomekslesicki

@Christoph-Hart that would be super cool and flexible I think!

Simon

@Christoph-Hart Oh boy my favorite subject :)

First off, automatically aligning the legatos to sustains is one of those coveted features that's impossible to do in Kontakt, at least without squashing your samples to a consistent pitch (and rendering them unlistenable in the process, which you then remedy by re-applying a pitch curve). What you are working on is beautiful and super exciting.

RE: a few of your points.

Sine waves

This alignment will be most noticeable on solo instruments that are already closer to sine waves than to the chaos of an ensemble. I imagine testing with sine waves is a totally suitable stand-in, and aligning based on rising/falling zero crossings should already give very good results.

Legato Parameters

I would like to have access to:

Attack and release time, and curve, for source sustain, legato sample, target sustain.
Volume for each legato sample (I don't see any reason this would be removed but just in case. Maybe it's possible to include this in a "legato adjustment" ui?)

Most of the time you will want the crossfades to be the same length on both sides, ie. sourceSustain > legato will have an xfade of ~180ms, and this value will be adjusted for both. This is what I eventually settled on for Poeesia. However, sometimes having the source sustain play a little longer can help preserve ambience, eg. sourceSustain.Release = 220ms, legato.Attack = 180ms.
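As a rough Python sketch of that asymmetric case, assuming plain linear fades (the function names are made up, and the curve shape is a simplification):

```python
def fade_out_gain(t_ms, release_ms):
    """Linear gain of the source sustain, t_ms after the transition starts (t_ms >= 0)."""
    if release_ms <= 0.0:
        return 0.0
    return max(0.0, 1.0 - t_ms / release_ms)


def fade_in_gain(t_ms, attack_ms):
    """Linear gain of the legato sample, t_ms after the transition starts (t_ms >= 0)."""
    if attack_ms <= 0.0:
        return 1.0
    return min(1.0, t_ms / attack_ms)


def crossfade_gains(t_ms, source_release_ms=220.0, legato_attack_ms=180.0):
    """Return (source_gain, legato_gain) for the sourceSustain > legato crossfade."""
    return fade_out_gain(t_ms, source_release_ms), fade_in_gain(t_ms, legato_attack_ms)
```

With the 220 ms / 180 ms values, the legato sample is already at full gain at 180 ms while the source sustain is still slightly audible, which is the extra bit of ambience.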

There might also be situations where you want release samples to have yet another setting, so targetSustain.release would change depending on whether you're playing to a new legato sample, or to a release sample.

Whether it's worth exposing all the parameters separately depends on how annoying it is for you to implement.

sourceSustain > legato, crossfadeTime
sourceSustain > legato, crossfadeShape
legato > targetSustain, crossfadeTime
legato > targetSustain, crossfadeShape

is probably sufficient, and in fact probably the nicest dev experience.

Legato length

What I loved about building Poeesia in HISE was I wasn't forced to choose a "best fit" setting for the legato sample length like you're forced to in Kontakt.

I just got the length of the legato sample from the samplemap, and delayed playback of the target sustain by that amount. I was then free to set unique start and end points (and volume) for every legato sample in the library. This is the main reason the instrument sounds better than it ever could in Kontakt.

Sustain offset

Legato sounds much more natural if you jump very far into the target sustain. I know this is tricky in HISE with the maximum offset being relatively small, I forget the name of the remedy you and David came up with.

Phase locking sustains

I do not pitch/phase lock my samples because it's never really been an option, and is only really beneficial on solo instruments with multiple dynamics. As usual, doing it in Kontakt is a pain, though I know some devs have done it.

However, I assume using Loris to flatten the pitch, and re-applying the extracted pitch curve from one dynamic layer to all dynamic layers using the sample editor envelope in HISE, would be an extremely pleasant workflow.

Even on raw, non-phase-locked samples, aligning the phase of sustains to legatos is very noticeable, just like it is for loops.

Polylegato

You probably know this already, but just to recap. Two different conditions should trigger a legato sample:

key released, and another one pressed soon after (jump legato)
key pressed, and another one released soon after (overlap legato)

Either one of these on its own is also technically polylegato, but it is much less playable for the end user, to the point where I wouldn't include it in a library.

The only parameter that needs to be exposed is the "legato window", which also represents the added latency. I experimented with different times for jump/overlap legato, but settled on adjusting them together and exposing that setting in the UI of the instrument.
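A minimal Python sketch of the two trigger conditions with a shared legato window. The event format and `detect_legato` are invented for the example, and it deliberately ignores which of several held notes a transition should be paired with, which a real polylegato script has to solve.

```python
def detect_legato(events, window_ms):
    """Find legato transitions in a time-sorted list of (time_ms, kind, note).

    kind is "on" or "off". Returns (from_note, to_note, legato_type) tuples.
    """
    transitions = []
    for i, (t_a, kind_a, note_a) in enumerate(events):
        for t_b, kind_b, note_b in events[i + 1:]:
            if t_b - t_a > window_ms:
                break  # outside the legato window, stop looking
            if note_a == note_b:
                continue
            if kind_a == "off" and kind_b == "on":
                # key released, another one pressed soon after
                transitions.append((note_a, note_b, "jump"))
                break
            if kind_a == "on" and kind_b == "off":
                # key pressed, another one released soon after
                transitions.append((note_b, note_a, "overlap"))
                break
    return transitions
```

Because the second event can arrive up to `window_ms` after the first, a decision cannot be made before the window has passed, which is where the added latency comes from.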

aaronventure

@Christoph-Hart Phase-locking will only work on very, very dry mono samples.

Wet samples are inherently impossible to phase-lock in a crossfade because the phase differences are a result of the pitch fluctuations in the sustained note.

Some instruments don't have this behavior, but the ones where you wanna have legato samples do. Another factor is the noise in the recording, either produced directly by the instrument, or noise from the mic and pre-amp, but the last two you can easily reduce by simply opening your wallet. The noise will affect the detection of the pitch fluctuations if your goal is to iron it out using brute-force.

So when you put an instrument into a hall and record them, then try to phase-lock layers or even legato samples, there are a few wrecking balls coming at you full-speed. Already the first one is unavoidable, though:

The pitch fluctuations (and noise) reverberate in the room, with their reverberation being recorded in the sustain of the sample. You can't phase match that, because at that point it's no longer a single voice with a single pitch; you have the original sound coming out of the instrument, and the reflected instances of that same sound with a slightly different pitch. If you move closer during the recording process to try and reduce the audible reverberation, you still have early reflections which are way more audible than the tail.

But even if you somehow survived the first wrecking ball (you won't), there's another:

Only close mics are mono, but not always; orchestral sections are often close-miked LCR (strings and woodwinds) and some engineers even prefer two close mics on horns and two on trumpets/trombones. Pianos are always stereo-miked, and harps and mallets are as well more often than not. Even if you manage to eliminate the room issue, all you need for your now seemingly sure phase-locking idea to fail is for the instrument to move, even just a little bit. You're coming back from a break and will now record the legato samples... The player accidentally moves the chair as they settle into it, and they lean a little bit the other way. The reflections all shift, slightly but they do, and the direct sound and the reflections now all have different timings on each of the mics in the main configuration. You can try phase-locking individual mics in the configuration, but the sense of direction that they provide to an instrument comes precisely from how the reflections interact with the spaced microphones.

That means your phase-locked legato will require bone-dry mono samples (or near-bone-dry), otherwise it's useless. This is the flaw with conventionally sampled instruments that every composer working with libraries like that accepts and understands (except me, which is why I made Infinite Brass and why I continue working on the concept). The main problem here is then putting these instruments into a convincing space. There are already some solutions on the market that claim to offer it, but all require extensive tweaking. So this may affect how many devs go for this approach at this time.

The legato groups automatically handling all the logic is a great idea and will surely attract a lot of people. I think one of the best advances in conventional sampled legato was made by Cinematic Studio Series where they recorded the slowest legato (gliss/port. is separate) and then use timestretching to shrink it down for higher velocities with some smart editing.

Therefore, one parameter you should definitely consider adding is the stretch %, and have the option to allow the legato samples to be timestretched.

David Healey

@aaronventure said in Phaselocking 2025 and other sampler related discussions:

Phase-locking will only work on very, very dry mono samples.

It works with stereo samples too, you just treat them like two mono samples - the same as with multi-mics.

@aaronventure said in Phaselocking 2025 and other sampler related discussions:

Wet samples are inherently impossible to phase-lock in a crossfade because the phase differences

I just wanted to test this. Here is a trumpet C3 crossfade between 3 dynamics, original and phase-aligned. These are hall samples. They both sound pretty good actually thanks to the smoothing of the CC modulator. I think the crossfade sounds a little better on the aligned one but that's subjective. The attack of the aligned one is a little chorusy.

Original: [audio/oxf8dLtGB2PiIDx8iy6S9iS8.wav]

Aligned: [audio/l3jCXaUcKtG9Dhsmv_25Ctjt.wav]

@Christoph-Hart RR groups still have issues in the develop branch - when you import samples mapped to RR groups the little popup about creating groups (old vs new) has the wrong number and trying to adjust the RR group of a sample will move it to group 0 (which doesn't exist).

Peek 2025-05-14 15-09.gif

aaronventure

@d-healey said in Phaselocking 2025 and other sampler related discussions:

I just wanted to test this. Here is a trumpet C3 crossfade between 3 dynamics, original and phase-aligned. These are hall samples. They both sound pretty good actually thanks to the smoothing of the CC modulator. I think the crossfade sounds a little better on the aligned one but that's subjective. The attack of the aligned one is a little chorusy.

Both of these sound good. But that crossfade is very slow. How does it sound if you stay at the crossfade instead of driving the controller past it to fully fade in the other layer?

What about when you do a very fast crossfade? Because that's what'll happen when crossfading to a legato sample, and the room sound will be where it's the most obvious, and you can't remove it.

David Healey

@aaronventure said in Phaselocking 2025 and other sampler related discussions:

What about when you do a very fast crossfade?

original-fast.wav
aligned-fast.wav

Simon

@d-healey Audio embeds already work!

David Healey

@Simon Perfect, thanks, I didn't realise!

clevername27

@Christoph-Hart Are you sure you're solving the right problem?

Simon

@aaronventure Aligning legato samples to the sustains as Christoph is implementing certainly doesn't require bone dry nor pitch-locked samples. It's just like editing a loop, the crossfade happens over a few hundred ms, so the pitch variations don't have time to throw it more than a few degrees out of phase. Even on an ensemble the difference is subtle, but definitely audible.

As for pitch-squashing / phase-locking of different dynamic levels, you probably have the most experience with that of anyone here, and I agree with your analysis that it's practically useless on anything other than bone dry samples.

@clevername27 Depends what you mean by the "right" problem! Aligning legatos to sustains is a problem that every Kontakt developer who has ever done legato has encountered, and has no way of solving. Without switching to HISE, of course ;)

Christoph Hart

nice nice nice, good discussion guys.

This alignment will be most noticeable on solo instruments that are already closer to sine waves than to the chaos of an ensemble

Of course. In the end you will probably enable this for solo instruments that are reasonably dry and deactivate it for big ensemble patches, but for the "sinesque" stuff it makes a drastic difference (hence my inquiry about phase-locking, as this is also most useful for those kinds of sounds).

Attack and release time, and curve,

Currently it's a single time for both the fade into and out of the legato sample, but I can easily make this two separate values. But do you really need to adjust the curve like in a real world scenario? I'm using the same internal technique as the usual Synth.addVolumeFade(), so in that case I would have to define a per-sampler curve for all volume fades.

There might also be situations where you want release samples to have yet another setting, so targetSustain.release would change depending on whether you're playing to a new legato sample, or to a release sample.

This is a subject for another discussion - the release transition will be handled by the specific release trigger layer that will come with its own set of properties (but these are much clearer to me).

I just got the length of the legato sample from the samplemap, and delayed playback of the target sustain by that amount.

Yes that's precisely how it works at the moment and is hardcoded into the engine. It will not account for realtime-pitch modulation through modulators, but the sustain sample will be automatically delayed by the exact sample length of the legato sample.

Legato sounds much more natural if you jump very far into the target sustain. I know this is tricky in HISE with the maximum offset being relatively small, I forget the name of the remedy you and David came up with.

Yes, that makes sense, and the idea is an additional API method that can add an extra start offset to the sustain sample to make this work. The max offset is UINT16_MAX, which would be 65535. If that isn't enough you can add a multiplier somehow, but I also forgot how to do that. David?
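To put that limit into time units, here is a quick Python calculation. It assumes the offset is counted in samples; the `multiplier` argument is only a stand-in for the multiplier idea and does not correspond to a specific API.

```python
UINT16_MAX = 65535  # largest value of an unsigned 16-bit integer


def max_offset_ms(sample_rate, multiplier=1):
    """Longest start offset in milliseconds for a given sample rate."""
    return UINT16_MAX * multiplier * 1000.0 / sample_rate
```

At 44.1 kHz this comes out at roughly 1.49 seconds, and at 48 kHz at roughly 1.37 seconds, so jumping further into the sustain than that needs the multiplier.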

I think one of the best advances in conventional sampled legato was made by Cinematic Studio Series where they recorded the slowest legato (gliss/port. is separate) and then use timestretching to shrink it down for higher velocities with some smart editing.

Yeah, then we are back in CPU spike-land as my masterly deceitful trick of delaying the sample playback a few milliseconds until the timestretchers are initialised messes up the timing between legato & sustain samples.

I would assume that this process can be done offline and render a few versions to be layered across the velocity axis - I don't think that there's a need for more than eg. 5-6 prerendered legato speeds.
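Picking one of the prerendered versions from the note velocity could look like the Python sketch below. The even split of the velocity range and the function name are assumptions, and it follows the earlier description that higher velocities get the shorter (faster) transitions.

```python
def legato_layer_for_velocity(velocity, num_layers=5):
    """Map a MIDI velocity (1..127) to a prerendered legato layer index.

    Layer 0 is assumed to be the slowest render, num_layers - 1 the fastest.
    """
    if not 1 <= velocity <= 127:
        raise ValueError("velocity must be in 1..127")
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    return min(num_layers - 1, (velocity - 1) * num_layers // 127)
```

In a samplemap the same split would simply be done with velocity ranges per legato sample, so no extra logic is needed at playback time.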

David Healey

@Christoph-Hart said in Phaselocking 2025 and other sampler related discussions:

but I also forgot how to do that. David?

https://github.com/christophhart/HISE/pull/680

aaronventure

@Christoph-Hart said in Phaselocking 2025 and other sampler related discussions:

I would assume that this process can be done offline and render a few versions to be layered across the velocity axis - I don't think that there's a need for more than eg. 5-6 prerendered legato speeds.

Is this something HISE could do? Could you make it less intensive on disk space?

Can you somehow precalculate 126 steps of timestretch without it being equal to rendering it out that many times?

If the answer is still rendering it to audio, this is one of those times having OPUS support would be very beneficial.

Simon

@Christoph-Hart Single time for fade into and fade out of legato sample is fine. Curve has been very important to adjust in the past, but I realize that's always been in the context of typical unaligned legato, where it might play in or out of phase.

So, going on the assumption that the phase-align-o-matic does its job, the crossfade should be linear as it is now, and probably doesn't need to be adjustable.
