ElevenLabs, a New York-based artificial intelligence (AI) firm, last week released an application programming interface (API) for its recently launched Voice Design feature. Alongside the API, the company introduced an open-source project dubbed X to Voice, which can generate a unique voice for an X (formerly known as Twitter) profile based on the user's posts. The tool also shows the text prompt that it auto-generates from its analysis of the profile.
In a blog post, ElevenLabs detailed the two new AI tools. The first is an API version of the recently introduced Voice Design tool, a capability developed by the company that can generate unique AI voices from text prompts. The voices are based on the description shared by the user, which can cover the pitch, timbre, delivery pace, intonation, and more.
Now, this feature is being made available via the company's API, which means developers can use the capability to build apps and software. Developers can use Voice Design to create voices for their own AI characters, or offer it to users so that they can generate new voices for themselves.
The company has offered two endpoints. The first allows developers to generate three unique voice previews based on a text prompt. The second allows them to save the voice previews to their library for local use. ElevenLabs did not disclose the price of the API or the cost per request of the AI model. Details about the AI model are also not known.
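The two-endpoint flow described above can be sketched roughly as follows. This is only an illustration: the article does not give the endpoint paths, field names, or authentication scheme, so the base URL, the paths, the JSON fields, and the `X-Api-Key` header below are all placeholder assumptions rather than the company's actual API. The functions only build the requests and do not send them.

```python
import json
from urllib import request

# Assumption: placeholder URL and paths. The article names none of these.
BASE_URL = "https://api.example.invalid/v1"
PREVIEWS_PATH = "/voice-design/previews"  # hypothetical: generate previews
SAVE_PATH = "/voice-design/save"          # hypothetical: save a preview


def _post(api_key: str, path: str, payload: dict) -> request.Request:
    """Build (but do not send) a JSON POST request."""
    return request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: header-based API key; the real scheme is not stated.
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


def build_preview_request(api_key: str, description: str) -> request.Request:
    """First endpoint: ask for voice previews from a text prompt."""
    # "voice_description" is a hypothetical field name.
    return _post(api_key, PREVIEWS_PATH, {"voice_description": description})


def build_save_request(api_key: str, preview_id: str, name: str) -> request.Request:
    """Second endpoint: save one chosen preview to the voice library."""
    # "preview_id" and "voice_name" are hypothetical field names.
    return _post(api_key, SAVE_PATH, {"preview_id": preview_id, "voice_name": name})


if __name__ == "__main__":
    req = build_preview_request("YOUR_KEY", "A calm, low-pitched narrator")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

A developer would send the first request, let the user pick one of the returned previews, and then send the second request with that preview's identifier.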
The second tool is the company's open-source project dubbed X to Voice. It is an extension of the Voice Design feature and is available to test on a web client. Users can add an X username and the AI will automatically analyse the profile, including the bio and posts. Once the analysis is complete, it generates a text prompt based on the results.
The text prompt is then fed to Voice Design automatically to generate a unique voice for the profile. Gadgets 360 tested out the feature and found that it takes between 30 seconds and a minute to generate voice previews for a profile. In total, three voice previews are generated. The AI voice speaks a line which is also based on the analysis of the profile.
Alongside the three voice previews, the page also displays the text prompt it used to generate the AI voice. We also found that the feature animates the profile pictures of users who have added a close-up of their face, syncing lip and mouth movements to match the words being spoken.
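The X to Voice flow described above (profile in, text prompt out, prompt fed to Voice Design) can be pictured with a small sketch. Everything here is a stand-in: `build_voice_prompt` replaces the AI analysis step with simple string formatting, and `generate_previews` is a hypothetical callable representing the Voice Design step, not the project's actual code.

```python
from typing import Callable, List, Tuple


def build_voice_prompt(bio: str, posts: List[str]) -> str:
    """Stand-in for the AI analysis: fold the bio and a few posts into one prompt."""
    sample = " | ".join(posts[:3])  # only a few posts, to keep the prompt short
    return f"Voice for a profile with bio '{bio}' that posts: {sample}"


def x_to_voice(
    bio: str,
    posts: List[str],
    generate_previews: Callable[[str], List[str]],
) -> Tuple[str, List[str]]:
    """Analyse the profile, then hand the resulting prompt to the voice generator.

    Returns the prompt together with the previews, mirroring how the page
    shows the prompt alongside the generated voices.
    """
    prompt = build_voice_prompt(bio, posts)
    return prompt, generate_previews(prompt)


if __name__ == "__main__":
    # Fake generator that returns three labelled placeholders instead of audio.
    def fake(prompt: str) -> List[str]:
        return [f"preview-{i}" for i in range(1, 4)]

    prompt, previews = x_to_voice("Tech reporter", ["Post one", "Post two"], fake)
    print(prompt)
    print(previews)
```

Passing the generator in as a parameter keeps the analysis step and the voice step separate, which matches the article's description of the prompt being generated first and then fed to Voice Design.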