Learn what goes on behind the scenes at the most famous audio companies!
Listen on
- 🎧 Spotify
- 🎥 YouTube
- 🎧 Apple Podcasts (iTunes)
- 🎧 TuneIn Radio
Sign up for WolfSound’s newsletter!
Introduction
When I was doing my master’s at the University of Erlangen-Nürnberg, I was eager to learn the internals of the audio programming industry. My supervisor and mentor back then, Maximilian Schäfer, PhD, introduced me to Julian Parker, PhD, who was working at Native Instruments (NI) in Berlin at the time (this was before Native Instruments merged with iZotope and a few other audio brands).
Thanks to Julian, I got a behind-the-scenes look at the audio plugin development scene in a few one-on-one conversations, which we are happy to summarize publicly in a podcast episode!
Julian’s career is incredible: from natural sciences through a master’s in physical modeling and a PhD in virtual analog modeling, to an almost decade-long position at Native Instruments, all the way to TikTok and Stability AI, where he now works on generative music algorithms.
There are few people who have such a rich background in audio research and industry and even fewer who are willing to share the details of it publicly. That makes this episode all the more exciting!
Note: If you like the podcast so far, please go to Apple Podcasts and leave me a review there. You can do so on Spotify as well. It benefits both sides: more reviews mean a broader reach on Apple Podcasts, and feedback helps me improve the show and provide better-quality content to you. You can also subscribe and give a like on YouTube. Thank you for doing this 🙏
Episode contents
From this podcast, you will learn:
- how machine learning forever changed audio plugin design and development
- how big audio plugin companies operate internally
- how to learn C++ for audio programming
- whether you need a PhD to work in the R&D department of an audio company
- what the state of the art in generative music is
- how to learn to generate music with AI
- how to stay focused on research papers even when you read them after hours
- how to produce quality research
- how to rest & recharge after intense and focused work
This episode was recorded on June 26, 2024.
Julian’s Tips & Observations For Successful DSP Research & Development
- Best ideas often come in the moments outside of work (during a walk, a shower, etc.).
- Published research is a sign of competence.
- Neural networks are like digital signal processing blocks.
- To create novel research, learn all the existing approaches and think about what you can do better.
- All of Julian’s favorite works are ones that were published for free.
- People like demo videos.
- If you want to learn generative AI, go out there and start playing with open-source and open-weight models like LLaMA or Stable Audio Open. Run inference, tweak.
- C++ is still important for DSP & fast ML inference (low-level, high-performance programming).
- Try to maximize the day-to-day joy from what you’re doing.
- If you teach something, you need to learn it very deeply (which is a modus operandi of this blog, by the way).
- Focus in research comes from interest.
- If you work on something hard, focus on your work for a short time and relax completely afterward.
- Writing DSP is a great way to learn a programming language.
- Steps in Virtual Analog modeling:
- Maths
- Coding
- Tuning
- Optimization
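The four steps above can be illustrated on the simplest Virtual Analog example: digitizing a one-pole analog RC lowpass filter. This is a minimal sketch of my own (not code from the episode): the maths is the bilinear transform of the analog transfer function, the coding is a direct-form difference equation, the tuning is the cutoff pre-warping, and the optimization step (e.g. SIMD, moving to C++) is only noted in a comment.

```python
import math

def rc_lowpass_coeffs(cutoff_hz, sample_rate):
    # Maths: discretize the analog RC lowpass H(s) = wc / (s + wc)
    # with the bilinear transform s <- 2*fs*(1 - z^-1)/(1 + z^-1).
    # Tuning: pre-warp the cutoff so the digital filter matches the
    # analog magnitude response exactly at the cutoff frequency.
    wc = 2.0 * sample_rate * math.tan(math.pi * cutoff_hz / sample_rate)
    k = wc / (2.0 * sample_rate)
    b0 = k / (1.0 + k)          # feedforward coefficient (b1 == b0)
    a1 = (k - 1.0) / (1.0 + k)  # feedback coefficient
    return b0, a1

def process(samples, cutoff_hz, sample_rate=48000.0):
    # Coding: the difference equation
    #   y[n] = b0 * (x[n] + x[n-1]) - a1 * y[n-1]
    # Optimization would come last: port to C++, vectorize with SIMD,
    # avoid per-sample coefficient recomputation, etc.
    b0, a1 = rc_lowpass_coeffs(cutoff_hz, sample_rate)
    x_prev, y_prev = 0.0, 0.0
    out = []
    for x in samples:
        y = b0 * (x + x_prev) - a1 * y_prev
        x_prev, y_prev = x, y
        out.append(y)
    return out
```

Feeding a DC step through the filter shows it settling to unity gain, as the analog prototype does, which is a quick sanity check before moving on to tuning against the real circuit.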
References
- People
- Julian D. Parker
- Stefan Bilbao
- Vesa Välimäki
- Julius O. Smith
- Lauri Savioja
- Jonathan Abel
- Sebastian Schlecht
- Mati Karjalainen
- Till Bovermann
- Manfred Schroeder
- Egbert Juergens
- Steinunn Arnardottir
- Colin Raffel (a co-author of the T5 language model)
- Fabian Esqueda
- Hans-Joachim “Eddie” Mond
- Ed Newton-Rex
- Lykon: a Twitter user training image generation models based on Stable Diffusion
- AK on Hugging Face
- Music artists
- Aphex Twin
- Warp Records
- TJ Hertz (Objekt)
- Places
- University of Cambridge
- University of Edinburgh
- Aalto University (Acoustics Lab)
- Native Instruments
- Yamaha
- KORG
- TikTok (ByteDance)
- Jukedeck, led by Ed Newton-Rex (see above), acquired by TikTok
- Lexicon
- Alesis
- Stability AI
- Freesound.org
- SONY
- arxiv.org
- Meta
- GitHub
- Digital Audio FX (DAFX) conference
- Hugging Face
- DSP concepts
- FM Synthesis
- Virtual Analog (VA)
- finite difference scheme
- allpass filter
- dynamic range compression
- dispersive systems: systems in which different frequencies travel at different speeds
- Music Information Retrieval (MIR)
- analytical signal (Hilbert transform) using a filterbank
- reverb
- Feedback Delay Network
- symbolic music generation
- audio generation
- AI music generation
- style transfer
- Machine Learning
- SIMD
- Hardware
- ZX Spectrum
- Yamaha SG
- KORG Prophecy
- spring reverb
- Buchla filter (see below for a paper)
- Clavia Nord Micro Modular (https://www.vintagesynth.com/clavia/nord-micro-modular)
- monome’s norns platform
- From Native Instruments
- Audio software
- Programming languages
- FAUST
- SuperCollider
- C++
- Max for Live + Gen or Faust
- Python
- Libraries & frameworks
- Eigen C++ library
- JUCE C++ framework (podcast sponsor ♥️)
- PyTorch
- TensorFlow
- Keras
- JAX from Google
- CUDA
- Research papers
- Manfred Schroeder’s original reverb paper [PDF]
- artificial reverberation review paper: “Fifty Years of Artificial Reverberation” by Julian D. Parker et al.
- a follow-up to the above: “More Than 50 Years of Artificial Reverberation” by Vesa Välimäki et al.
- “A digital model for the Buchla lowpass-gate” by Julian D. Parker and Stefano D’Angelo [PDF]
- Diff-a-Riff paper from SONY
- Stemgen paper by Julian D. Parker et al.
- Image generation models
- DALL-E
- Stable Diffusion
- Language models
- Mistral
- LLaMA
- Music generation models
Thank you for listening!