Neural Audio Codecs & (Residual) Vector Quantization
The technology behind State-of-the-Art Audio AI models
By Francesco Cariaggi
In this blog post, I’ll take you through two important concepts behind modern Audio AI models such as Google’s AudioLM, Microsoft’s VALL-E and NaturalSpeech 2, Meta’s AudioGen and MusicGen, Suno’s Bark, Kyutai’s Moshi and Hibiki, and many more: Neural Audio Codecs and (Residual) Vector Quantization.