HLAC Compression for CH1 files - not working out so well...



  • OK, I get the feeling the project I'm working on is going to really tax HISE....

    So I have a sample map (one of many) with 27 round robin groups....

    There's a total of 4,239 wav files in there, mapped across these groups...

    They are all looped vocal samples...

    and on disk (as wav files) they add up to 2.54 GB

    So I've built this sample map in two different ways:

    First I loaded up each of the round robin groups with the required sounds. Some of these groups are duplicates of others; they are duplicated so that they match up with the RR groups in other sample maps (their siblings, if you will).

    Once built, I exported these as HLAC and got 2 .ch files as a result:

    testmap.ch1 -- 2,036,988 KB
    testmap.ch2 -- 1,405,831 KB

    That's 3,442,819 KB, or about 3.28 GB

    So for a start this looks odd - they are BIGGER than the wav file set they are supposed to be compressing....

    OK, maybe the HLAC compression isn't doing the sensible thing and compressing a given wav file only once...

    So I built the map again only this time removing all the duplicate groups, then back to HLAC and ch making and I get:

    reducedmap.ch1 -- 2,036,966 KB
    reducedmap.ch2 -- 182,746 KB

    OK so that's different - now I have 2,219,712 KB in total, or about 2.1 GB. That's not much compression either: less than 20%.

    So two questions:

    What's happening with that first scenario? Is it really not compressing each wav file just the once?

    Is it my audio data (the human voice) that's just "hard" for HLAC?

    In all cases, by the way, it took an AGE for HISE to compress these files (about 30 mins each), so it's doing a LOT of something...

    Kontakt can get every single bit of the audio (about 40% more than I'm using here) down to "only" 1.8 GB in NCW format...
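
The size arithmetic in this post can be checked with a short Python sketch. It assumes the KB figures are 1,024-byte units and that the 2.54 GB on-disk figure uses the same binary units; both are assumptions about how the file manager reported the sizes.

```python
# Sizes reported in the post, in KB (assumed to be 1,024-byte units).
KB_PER_GB = 1024 * 1024
WAV_TOTAL_GB = 2.54  # size of the source WAV set as reported


def summarise(label, sizes_kb):
    """Return (total_kb, total_gb, size relative to the WAV set)."""
    total_kb = sum(sizes_kb)
    total_gb = total_kb / KB_PER_GB
    ratio = total_gb / WAV_TOTAL_GB
    print(f"{label}: {total_kb:,} KB = {total_gb:.2f} GB ({ratio:.0%} of the WAV size)")
    return total_kb, total_gb, ratio


full = summarise("testmap", [2_036_988, 1_405_831])
reduced = summarise("reducedmap", [2_036_966, 182_746])
```

Under those assumptions the first export comes out larger than the WAV set, and the second at roughly 83% of it.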



  • Hmm, there is a Duplicate flag in the samplemap XML that marks samples that are referenced more than once, and the HLAC encoder should skip these. Is this flag set in your samplemap?

    1. Is the source material 24bit and have you enabled TrueDynamics?
    2. The HLAC codec (and every other lossless audio codec) works best on decaying material because it can reduce the bit depth required to store the signal. For samples which are normalised and sustaining, the compression ratio is the worst, but it should at least yield a compression ratio of 70% (so your 2.54GB files should go to something like 1.8GB).
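
The bit-depth argument can be illustrated with a rough Python sketch. This is not the HLAC algorithm, only a hypothetical block-wise estimate: for each block it counts the bits needed to hold the largest sample value, which is the kind of saving a decaying signal allows and a normalised, sustaining one does not. The block size, test signals and function names are all assumptions.

```python
import math

SAMPLE_RATE = 44100
FULL_SCALE = 2 ** 23 - 1  # peak value of a 24-bit signed sample


def bits_needed(block):
    """Bits for a signed integer large enough to hold the block's peak."""
    peak = max(abs(s) for s in block)
    return peak.bit_length() + 1  # +1 for the sign bit


def average_bits(samples, block_size=512):
    """Average bits per sample if each block used only the bits its peak needs."""
    blocks = [samples[i:i + block_size] for i in range(0, len(samples), block_size)]
    total_bits = sum(bits_needed(b) * len(b) for b in blocks)
    return total_bits / len(samples)


def sine(n, decay_per_second=0.0):
    """220 Hz sine as 24-bit integers, optionally with exponential decay."""
    return [
        int(FULL_SCALE
            * math.exp(-decay_per_second * t / SAMPLE_RATE)
            * math.sin(2 * math.pi * 220 * t / SAMPLE_RATE))
        for t in range(n)
    ]


sustained = sine(SAMPLE_RATE * 2)                        # normalised, sustaining tone
decaying = sine(SAMPLE_RATE * 2, decay_per_second=4.0)   # plucked-style decay

print(f"sustained: {average_bits(sustained):.1f} bits/sample")
print(f"decaying:  {average_bits(decaying):.1f} bits/sample")
```

The sustained tone needs the full bit depth in every block, while the decaying one needs fewer and fewer bits as it fades, so looped, normalised vocals sit at the unfavourable end of this estimate.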


  • @Christoph-Hart yes, the Duplicate flag is set:

    <sample ID="30" LoVel="6" HiVel="15" FileName="{PROJECT_FOLDER}CT Ee_32.WAV"
              LoKey="84" HiKey="84" SampleStart="0" SampleEnd="132650" LoopEnabled="1"
              LoopStart="19820" LoopEnd="65674" Pitch="0" Root="84" Volume="-6"
              Pan="0" RRGroup="5" Duplicate="1" MonolithOffset="541708288"
    

    ..and yes 24-bit audio

    File Name: Blue Vowel Ar_08.wav
    File Size: 659 kB
    File Type: WAV
    File Type Extension: wav
    Mime Type: audio/x-wav
    Originator: Pro Tools
    Originator Reference: 8QTvSdnSqKoaaaGk
    Date Time Original: 2014:12:11 19:26:33
    Time Reference: 1959370
    Bwf Version: 0
    Encoding: Microsoft PCM
    Num Channels: 1
    Sample Rate: 44100
    Avg Bytes Per Sec: 132300
    Bits Per Sample: 24
    Cue Points: (Binary data 28 bytes)
    Duration: 5.10 s
    Category: audio

    I've also tried to use "True Dynamics" with very little change...



  • @Lindon - interestingly, yet more sfz problems...

    Loading an sfz now captures the loops but sets Duplicate="1" in all cases...

    <sample ID="1" LoVel="46" HiVel="50" FileName="{PROJECT_FOLDER}CT Boot Boo FIXED_01.WAV"
              LoKey="53" HiKey="53" SampleStart="0" SampleEnd="158448" LoopEnabled="1"
              LoopStart="23385" LoopEnd="140829" Pitch="0" Root="53" Volume="-6"
              Pan="0" RRGroup="1" Duplicate="1"/>
    

    There are no duplicates in this sfz....



  • @Lindon OK, more feedback: I went through and laboriously hand-edited the sample map to make sure all the entries were correct, which took me 3 hours. There were all sorts of strange duplications and anomalies (I still say this whole sample map making needs a serious workover). I ran the compression again and now I get a (very nice) 1,732,780 KB, or about 1.65 GB, so a massive improvement. Now all I have to do is go back and do the same to the remaining sample maps --- possibly days of work...



  • @Lindon Which part are you hand editing? The loop points?



  • @d-healey - no, they are coming over fine now. Duplicate="X" is often wrong, and there are silly amounts of duplicated entries, i.e. exactly the same metadata for a single file:

    <sample ID="3" LoVel="46" HiVel="50" FileName="{PROJECT_FOLDER}CT Toost Too_04.WAV"
    LoKey="56" HiKey="56" SampleStart="0" SampleEnd="144671" LoopEnabled="1"
    LoopStart="30942" LoopEnd="117149" Pitch="0" Root="56" Volume="-6"
    Pan="0" RRGroup="1" Duplicate="0"/>

    This entry is duplicated elsewhere (nowhere near it) in the sample map, but with Duplicate="1" set...
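
Entries like this can be found mechanically. The following Python sketch assumes the samplemap is an XML file whose `sample` elements carry the attributes shown above, and groups entries that are identical apart from `ID` and `Duplicate`. The function name, the `samplemap` root element, the choice of ignored attributes and the ID 97 entry in the example data are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

IGNORED = {"ID", "Duplicate"}  # assumption: only these differ between true copies


def find_repeated_entries(root):
    """Map each repeated attribute set to the list of sample IDs that share it."""
    groups = defaultdict(list)
    for sample in root.iter("sample"):
        key = tuple(sorted((k, v) for k, v in sample.attrib.items() if k not in IGNORED))
        groups[key].append(sample.get("ID"))
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Tiny inline example; a real map would be loaded with ET.parse(path).getroot().
xml_text = """
<samplemap>
  <sample ID="3" FileName="{PROJECT_FOLDER}CT Toost Too_04.WAV" Root="56" RRGroup="1" Duplicate="0"/>
  <sample ID="4" FileName="{PROJECT_FOLDER}CT Ee_32.WAV" Root="84" RRGroup="5" Duplicate="0"/>
  <sample ID="97" FileName="{PROJECT_FOLDER}CT Toost Too_04.WAV" Root="56" RRGroup="1" Duplicate="1"/>
</samplemap>
""".strip()

root = ET.fromstring(xml_text)
for key, ids in find_repeated_entries(root).items():
    print(dict(key)["FileName"], "appears with identical metadata at IDs", ids)
```

Whether a repeated entry is a mistake or an intentional copy still needs a human decision; the sketch only lists candidates.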



  • @Lindon Could you send me over one of the files so I can experiment to see if I can find a way to clean it up more quickly?



  • @d-healey well I just wrote over them!!! Let me see if I can find an original "broken" one...



  • @Lindon said in HLAC Compression for CH1 files - not working out so well...:

    possibly days of work...

    Nah, we'll make a fix for that one. What if there is a simple function that runs over the samplemaps and fixes the duplicate ID flag? This flag is not being used super prominently, so there might be many occasions where it gets corrupted (eg. if you delete a duplicate sample making the other one non-duplicate, etc).
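
As a rough idea of what such a function could look like outside HISE, here is a Python sketch. It assumes that `Duplicate="1"` should mark every entry whose `FileName` already appeared earlier in the map and `Duplicate="0"` the first occurrence; that rule is a guess at the intended semantics, not confirmed behaviour. `fix_duplicate_flags`, the `samplemap` root element and the example file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def fix_duplicate_flags(root):
    """Recompute Duplicate on every <sample>; return how many flags changed."""
    seen = set()
    changed = 0
    for sample in root.iter("sample"):
        name = sample.get("FileName")
        correct = "1" if name in seen else "0"  # assumed rule: first occurrence is 0
        seen.add(name)
        if sample.get("Duplicate") != correct:
            sample.set("Duplicate", correct)
            changed += 1
    return changed


# Made-up map in which every entry was flagged as a duplicate on import.
xml_text = """
<samplemap>
  <sample ID="1" FileName="{PROJECT_FOLDER}A.wav" RRGroup="1" Duplicate="1"/>
  <sample ID="2" FileName="{PROJECT_FOLDER}B.wav" RRGroup="1" Duplicate="1"/>
  <sample ID="3" FileName="{PROJECT_FOLDER}A.wav" RRGroup="2" Duplicate="1"/>
</samplemap>
""".strip()

root = ET.fromstring(xml_text)
changed = fix_duplicate_flags(root)
flags = [s.get("Duplicate") for s in root.iter("sample")]
print(changed, "flags corrected ->", flags)
```

To try this on a real file, one would load it with `ET.parse(path)`, run the function on `tree.getroot()` and save with `tree.write(path)`, ideally on a backup copy first.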



  • @Christoph-Hart said in HLAC Compression for CH1 files - not working out so well...:

    What if there is a simple function that runs over the samplemaps and fixes the duplicate ID flag?

    I'd trade this for the ability to select an RR group and have HISE actually import dropped audio into it, not into group 1...



  • Have you tried using the scripting API calls to actually do the mapping? If you have a ton of samples to map, this is usually the way to go as you can apply Regex rules to extract almost every property of a properly named sample file.

    Haters might argue this is the reason why the GUI based mapping is so sloppy - I never did it again after writing the scripting API for setting sample properties...
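
To give a flavour of the regex approach without guessing at the HISE scripting API, here is a standalone Python sketch. The file naming scheme (`<name>_<root>_<lovel>-<hivel>_RR<group>.wav`) is invented for illustration; the property names match the samplemap entries quoted earlier in the thread.

```python
import re

# Hypothetical scheme: <name>_<root note number>_<lo vel>-<hi vel>_RR<group>.wav
PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<root>\d+)_(?P<lovel>\d+)-(?P<hivel>\d+)_RR(?P<rr>\d+)\.wav$",
    re.IGNORECASE,
)


def parse_sample_name(filename):
    """Return samplemap-style properties, or None if the name does not match."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    root = int(m["root"])
    return {
        "FileName": filename,
        "Root": root,
        "LoKey": root,
        "HiKey": root,
        "LoVel": int(m["lovel"]),
        "HiVel": int(m["hivel"]),
        "RRGroup": int(m["rr"]),
    }


print(parse_sample_name("CT Ee_84_6-15_RR5.wav"))
print(parse_sample_name("CT Ee_32.WAV"))  # does not follow the scheme
```

Names that do not follow the scheme return `None`, which makes it easy to list the files that still need renaming before an import.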



  • @Christoph-Hart never thought of doing it that way - which scripting API calls do you mean?





  • yeah -- sorta -- I can see how this would be useful if (as Christoph says in the thread) all the audio had comprehensive and inclusive file names (that included low note, high note, lo velo, hi velo, RR group etc.), but frankly I almost never get audio in that sort of format. It's all "hand-named" using some folder structure, and often comes from some pre-existing Kontakt product, so via sfz at best. So I'm stuck with the drag-and-drop interface. I can usually work a way around it (again, usually with some Python scripting to build the XML), but it'd be nice if I could drag a set of audio into a sampler in a given RR group and have it map there correctly. It never ever does.
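
The kind of Python workaround mentioned here might look like the sketch below: it builds `sample` entries for a list of files and places them all in one chosen RR group. The attribute set mirrors the entries quoted in this thread, but the helper name, the `samplemap` root element, the one-key-per-file mapping and the default values are assumptions, and the output is not guaranteed to be a complete, loadable samplemap.

```python
import xml.etree.ElementTree as ET


def build_group(filenames, rr_group, first_key, lo_vel=0, hi_vel=127):
    """Create <sample> elements, one key per file, all in the same RR group."""
    root = ET.Element("samplemap")  # assumed root element name
    for offset, name in enumerate(sorted(filenames)):
        key = first_key + offset
        ET.SubElement(root, "sample", {
            "ID": str(offset),
            "FileName": "{PROJECT_FOLDER}" + name,
            "LoKey": str(key),
            "HiKey": str(key),
            "Root": str(key),
            "LoVel": str(lo_vel),
            "HiVel": str(hi_vel),
            "RRGroup": str(rr_group),
            "Duplicate": "0",
        })
    return root


files = ["CT Ee_02.WAV", "CT Ee_01.WAV"]  # made-up file names
group = build_group(files, rr_group=5, first_key=60)
print(ET.tostring(group, encoding="unicode"))
```

Sorting the file names first keeps the key assignment reproducible between runs, which matters when sibling maps have to line up group by group.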



  • If it's of any help, the folder name is also part of the sample's FileName property.

    Another helpful tool is a batch file rename utility - usually the samples that I work with have "some" sort of usable info in the filename which might be massaged into something that the regex processor can cope with. This is the way to go to clean up inconsistencies before the import process (and your samples might thank you anyway for getting more meaningful names along the way).



  • @Christoph-Hart said in HLAC Compression for CH1 files - not working out so well...:

    Another helpful tool is a batch file rename utility -

    This is my solution to badly named samples too. I use PyRenamer



  • @d-healey @Christoph-Hart -- oh trust me this is a life saver for me (well MASSIVE time saver at least):

    https://www.bulkrenameutility.co.uk/



  • yup, that's exactly what I'm using 🙂

