Date: 2023-08-02
Meta is taking on generative AI’s audio white whale.
While there are already dozens of solutions for AI text and images, speech synthesis is currently the most common and open application for audio AI. Meta’s new AudioCraft tool aims to change that by providing users with full musical soundtracks.
“Imagine a professional musician being able to explore new compositions without having to play a single note on an instrument,” Meta wrote in a blog post announcing the tool.
“That’s the promise of AudioCraft — our latest AI tool that generates high-quality, realistic audio from music and text,” the post adds.
AudioCraft utilizes three Meta-developed AI models, all of which were trained on either public domain or Meta-owned and licensed sources. While Mark Zuckerberg's social media giant is pushing music first and foremost, AudioCraft also has pre-trained models for generating “sound effects like a dog barking, cars honking, or footsteps on a wooden floor.”
Like Meta’s Llama 2 language model and Voicebox text-to-speech model, the models behind AudioCraft are open source, with Meta leaning on external developers to help MusicGen, its music-generation model, “turn into a new type of instrument — just like synthesizers when they first appeared.”
AudioCraft currently has a number of sample tracks and sound effects available for listening in the blog post, though all are notably instrumental, likely to avoid legal issues.
Select media outlets got early access to additional samples, with The Verge giving an overall impression that “AudioCraft sounds like something that could be used for elevator music or stock songs that can be plugged in for some atmosphere rather than the next big pop hit.”
Meta isn’t the first major tech company to tackle AI audio, but alternatives like Google’s MusicLM are still limited to select users.
