Generating AI Music with Julian Parker (Stability AI, ex-TikTok, ex-Native Instruments) | WolfTalk #025

Posted by Jan Wilczek on December 08, 2024 · 7 mins read

Learn what goes on behind the scenes of the most famous audio companies!

Sign up for WolfSound’s newsletter!

Introduction

When I was doing my master’s at the University of Erlangen-Nürnberg, I was eager to learn the internals of the audio programming industry. My supervisor and mentor back then, Maximilian Schäfer, PhD, introduced me to Julian Parker, PhD, who was working at Native Instruments (NI) in Berlin at the time (this was before Native Instruments merged with iZotope and a few other audio brands).

Thanks to Julian, I got a behind-the-scenes look at the audio plugin development scene in a few one-on-one conversations, which we are now happy to summarize publicly in a podcast episode!

Julian’s career is incredible: from natural sciences through a master’s in physical modeling, a PhD in virtual analog modeling, and an almost decade-long position at Native Instruments, all the way to TikTok and Stability AI, where he now works on generative music algorithms.

There are few people who have such a rich background in audio research and industry and even fewer who are willing to share the details of it publicly. That makes this episode all the more exciting!

Note: If you like the podcast so far, please go to Apple Podcasts and leave me a review there. You can do so on Spotify as well. It benefits both sides: more reviews mean a broader reach on Apple Podcasts, and your feedback helps me improve the show and provide better-quality content to you. You can also subscribe and leave a like on YouTube. Thank you for doing this 🙏

Episode contents

From this podcast, you will learn:

  • how machine learning forever changed audio plugin design and development
  • how big audio plugin companies operate internally
  • how to learn C++ for audio programming
  • whether you need to have a PhD to work in an R&D department of an audio company
  • what the state of the art in generative music is
  • how to learn to generate music with AI
  • how to be able to focus on research papers even if you read them after hours
  • how to produce quality research
  • how to rest & recharge after intense and focused work

This episode was recorded on June 26, 2024.

Julian’s Tips & Observations For Successful DSP Research & Development

  1. Best ideas often come in moments outside of work (during a walk, a shower, etc.).
  2. Published research is a sign of competence.
  3. Neural networks are like digital signal processing blocks.
  4. To create novel research, learn all the existing approaches and think about what you can do better.
  5. All of Julian’s favorite works are the ones published for free.
  6. People like demo videos.
  7. If you want to learn generative AI, go out there and start playing with open-source and open-weight models like LLaMA or Stable Audio Open. Run inference, tweak.
  8. C++ is still important for DSP & fast ML inference (low-level, high-performance programming).
  9. Try to maximize the day-to-day joy from what you’re doing.
  10. If you teach something, you need to learn it very deeply (which is a modus operandi of this blog, by the way).
  11. Focus in research comes from interest.
  12. If you work on something hard, focus on your work for a short time and relax completely afterward.
  13. Writing DSP is a great way to learn a programming language.
  14. Steps in Virtual Analog modeling:
    1. Maths
    2. Coding
    3. Tuning
    4. Optimization
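The Virtual Analog steps above (maths, coding, tuning, optimization) can be sketched on a toy example. Below is a minimal Python sketch, assuming a first-order analog RC lowpass as the prototype: the "maths" step is the bilinear transform with cutoff prewarping, and the "coding" step is the resulting difference equation. All function names are illustrative, not from any library mentioned in the episode.

```python
import math


def design_one_pole_lowpass(cutoff_hz: float, sample_rate: float):
    """Maths step: bilinear transform of an analog RC lowpass
    H(s) = wc / (s + wc), with the cutoff prewarped so the digital
    filter's cutoff lands at the requested frequency."""
    g = math.tan(math.pi * cutoff_hz / sample_rate)
    b0 = b1 = g / (1.0 + g)
    a1 = (g - 1.0) / (1.0 + g)
    return b0, b1, a1


def process(samples, b0, b1, a1):
    """Coding step: direct-form difference equation
    y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1]."""
    x_prev = 0.0
    y_prev = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x_prev - a1 * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out
```

The remaining two steps are then tuning (comparing the digital response against the analog reference and adjusting) and optimization (e.g., vectorizing the loop or porting it to C++ with SIMD, which is where the C++ skills from tip 8 come in). By construction the filter has unit gain at DC: feeding it a constant signal converges to that constant.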

References

  1. People
    1. Julian D. Parker
      1. MSc thesis on spring reverberation [PDF]
      2. PhD thesis on dispersive systems
      3. LinkedIn
      4. Harmonai Discord
    2. Stefan Bilbao
    3. Vesa Välimäki
    4. Julius O. Smith
    5. Lauri Savioja
    6. Jonathan Abel
    7. Sebastian Schlecht
    8. Mati Karjalainen
    9. Till Bovermann
    10. Manfred Schroeder
    11. Egbert Juergens
    12. Steinunn Arnardottir
    13. Colin Raffel (a co-author of the T5 text-to-text transformer model)
    14. Fabian Esqueda
    15. Hans-Joachim “Eddie” Mond
    16. Ed Newton-Rex
    17. Lykon: a Twitter user training image generation models based on Stable Diffusion
    18. AK on Hugging Face
    19. Music artists
      1. Aphex Twin
      2. Warp Records
      3. TJ Hertz (Objekt)
  2. Places
    1. University of Cambridge
      1. Natural Sciences undergraduate course
    2. University of Edinburgh
      1. MSc in Acoustics and Music Technology
      2. MSc in Sound Design
      3. MMus in Composition
    3. Aalto University (Acoustics Lab)
    4. Native Instruments
    5. Yamaha
    6. KORG
    7. TikTok (ByteDance)
    8. Jukedeck, led by Ed Newton-Rex (see above) and acquired by TikTok
    9. Lexicon
    10. Alesis
    11. Google
    12. Stability AI
    13. Freesound.org
    14. SONY
    15. arxiv.org
    16. Meta
    17. GitHub
    18. Digital Audio FX (DAFX) conference
    19. Hugging Face
  3. DSP concepts
    1. FM Synthesis
    2. Virtual Analog (VA)
    3. finite difference scheme
    4. allpass filter
    5. dynamic range compression
    6. dispersive systems: different frequencies travel in them at different speeds
    7. Music Information Retrieval (MIR)
    8. analytic signal (Hilbert transform) computed using a filterbank
    9. reverb
      1. Feedback Delay Network
    10. symbolic music generation
    11. audio generation
    12. AI music generation
    13. style transfer
    14. Machine Learning
    15. SIMD
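As a small illustration of how two of the concepts above combine (the allpass filter as a reverb building block), here is a minimal Python sketch of a Schroeder allpass section; the delay length and feedback coefficient are arbitrary illustrative values, not taken from any plugin discussed in the episode.

```python
from collections import deque


def schroeder_allpass(samples, delay: int, g: float):
    """Schroeder allpass section: y[n] = -g*x[n] + x[n-M] + g*y[n-M].
    It passes every frequency at unit gain but smears energy in time,
    which is why chains of these sections are a classic reverb diffuser."""
    # Fixed-length deques act as M-sample delay lines; index 0 is the
    # oldest sample, i.e. x[n-M] / y[n-M].
    x_buf = deque([0.0] * delay, maxlen=delay)
    y_buf = deque([0.0] * delay, maxlen=delay)
    out = []
    for x in samples:
        y = -g * x + x_buf[0] + g * y_buf[0]
        x_buf.append(x)
        y_buf.append(y)
        out.append(y)
    return out
```

Because the section is allpass, an impulse fed through it keeps its total energy (the sum of squared output samples stays at 1) while being spread out over time: the impulse response is -g immediately, then (1 - g²)·gᵏ at each multiple of the delay.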
  4. Hardware
    1. ZX Spectrum
    2. Yamaha SG
    3. KORG Prophecy
    4. spring reverb
    5. Buchla filter (see below for a paper)
    6. Clavia Nord Micro Modular (https://www.vintagesynth.com/clavia/nord-micro-modular)
    7. monome’s norns platform
    8. From Native Instruments
      1. Maschine
      2. Traktor
  5. Audio software
    1. JPverb reverb by Julian
      1. Source code
    2. Greyhole reverb by Julian
      1. Source code
    3. Realms plugin
    4. From Native Instruments
      1. Reaktor Blocks
      2. Reaktor User Library
      3. Replika XT
      4. Mod Pack
      5. Crush Pack
      6. Guitar Rig
      7. Traktor
      8. Raum reverb plugin
  6. Programming languages
    1. FAUST
    2. SuperCollider
    3. C++
    4. Max for Live + Gen or Faust
    5. Python
  7. Libraries & frameworks
    1. Eigen C++ library
    2. JUCE C++ framework (podcast sponsor ♥️)
    3. PyTorch
    4. TensorFlow
    5. Keras
    6. JAX from Google
    7. CUDA
  8. Research papers
    1. Manfred Schroeder’s original reverb paper [PDF]
    2. artificial reverberation review paper: “Fifty Years of Artificial Reverberation” by Julian D. Parker et al.
    3. a follow-up to the above: “More Than 50 Years of Artificial Reverberation” by Vesa Välimäki et al.
    4. “A digital model for the Buchla lowpass-gate” by Julian D. Parker and Stefano D’Angelo [PDF]
    5. Diff-a-Riff paper from SONY
    6. Stemgen paper by Julian D. Parker et al.
  9. Image & language generation models
    1. DALL-E
    2. Stable Diffusion
    3. Mistral
    4. LLaMA
  10. Music generation models
    1. MusicLM
    2. Stable Audio 2.0
    3. Stable Audio Open (trained on CC0- and CC-BY-licensed music)
    4. Stable Audio Tools

Thank you for listening!
