An AI Tool from Google Can Now Transform Captions Into Music

As AI grows more powerful, algorithms can now turn ordinary text into images, animations, and even brief videos. Such algorithms have sparked considerable debate, and the Getty Images stock photo library is currently pursuing legal action against AI developers as a result. Now AI can generate music from text as well.

Transforming Text Into Music?

Researchers at Google have developed an AI system that can transform ordinary text descriptions into rich, varied, and relevant music. The company has demonstrated this by generating music from descriptions of well-known artworks. Amazing, and kind of scary.

Text-to-image systems depend on large collections of images with accompanying descriptions, which can be used to train a neural network. Comparable annotated datasets for music do not exist. In 2022, however, Google Research launched MuLan, an algorithm that generates a written description of a piece of music. Generally, a decent text description must cover the melody, rhythm, and timbre, as well as the various instruments and voices that the song may feature.

Using Multiple Databases for AI Training

Christian Frank and his colleagues at Google Research used MuLan to generate descriptive captions for music without copyright restrictions. They then used the resulting database to train a second neural network to perform the opposite task: converting a caption into music. They call the new algorithm MusicLM and demonstrate how it can make music from any given text, or rework audio such as whistling or humming to match a caption.

Of course, the algorithm isn’t perfect… yet. One significant problem is that it inherits the biases of the data used to train it. This raises questions about how appropriate the generated music is for cultures underrepresented in the training data, as well as concerns about cultural appropriation. Still, it’s impressive how far AI has come. And this is just the beginning.