Meta has announced the launch of AudioCraft, an open-source framework for generating high-quality, realistic audio and music from short text descriptions. This marks a significant step in Meta's push into audio generation; the company previously open-sourced its AI-powered music generator, MusicGen, in June.
The AudioCraft framework is designed to make generative audio models more accessible than previous efforts in the field. It provides a collection of sound and music generators, along with compression algorithms for creating and encoding songs and other audio.
AudioCraft comprises three generative AI models: MusicGen, AudioGen, and EnCodec. MusicGen itself is not new, but Meta has now released its training code, enabling users to train the model on their own music datasets. This could raise ethical and legal questions, however, since MusicGen learns from existing music in order to produce similar results.
AudioGen focuses on generating environmental sounds and sound effects from text, while EnCodec, a neural audio codec, has been improved over earlier versions to enable higher-quality music generation with fewer artifacts.
While Meta sees potential benefits for musicians in terms of inspiration and composition, the rise of image and text generators has already exposed drawbacks and potential legal challenges for generative AI. Even so, Meta has indicated it will continue to explore ways to improve the performance and controllability of generative audio models.