ElevenLabs, the voice AI startup known for its voice cloning, speech synthesis, and speech-to-speech models, has just added another tool to its product portfolio: an AI voice isolator.
Available on the ElevenLabs platform starting today, the offering allows creators to remove ambient noise and unwanted sounds from any content they have, whether it’s a film, podcast or YouTube video.
The feature follows the company’s launch of its Reader app and is free to use, with some limitations. It is not entirely new to the market, however: many other creative software providers, including Adobe, offer tools to improve the quality of speech in content. What remains to be seen is how effective Voice Isolator will be compared to them.
How does the AI voice isolator work?
When recording content like a movie, podcast, or interview, creators often face the problem of background noise, where unwanted sounds interfere with the content (think random people talking, the wind blowing, or a vehicle passing on the road). These noises may go unnoticed during filming, but they can affect the quality of the final result, mainly by sometimes drowning out the speaker’s voice.
To solve this problem, many people tend to use noise-canceling microphones that remove background noise during the recording phase itself. They get the job done, but may not be affordable in many cases, especially for early-career creators with limited resources. That’s where AI-powered tools like ElevenLabs’ new Voice Isolator come in.
The product works in post-production: the user only has to upload the content they want to improve. Once the file is uploaded, the underlying models process it, detect and remove unwanted noise, and output clean dialogue.
ElevenLabs claims that the product extracts speech with a level of quality similar to that of studio-recorded content. The company’s design lead Ammaar Reshi also shared a demo where the tool can be seen removing the noise of a leaf blower to extract the speaker’s clear speech.
We ran three tests to gauge how the voice isolator holds up in the real world. In the first, we spoke three separate sentences, each disrupted by a different background noise. In the other two, we spoke three sentences over a mixture of different noises occurring at random, irregular intervals.
In all cases, the tool was able to process the audio in a matter of seconds. More importantly, it removed noises—whether it was doors opening or closing, table banging, hand clapping, or household objects moving—in almost every case and extracted clear speech, without any distortion. The only sounds it failed to recognize and remove were the wall banging and finger snapping.
Sam Sklar, who manages growth for the company, also told us that the tool doesn’t work on musical vocals at this point, but users can try it on that use case and may have success with certain songs.
Improvements are probably underway
While Voice Isolator’s ability to remove irregular background noise certainly sets it apart from most other tools, which only handle steady, constant noise, there is still room for improvement. As with most tools of this kind, ElevenLabs will hopefully improve its performance over time.
It’s important to note here that the company hasn’t shared much information about the underlying models that power the tool or whether the recordings fed into it are used to train its models in any way. Sklar said he couldn’t share specifics about what goes into creating the model, but noted that the company has a form linked to its privacy policy where users can opt out of the use of personal data for training purposes.
For now, the company is offering Voice Isolator only through its platform. It plans to open up API access in the coming weeks, though the exact timeline remains unclear. For users who go to the website or app to test the tool, ElevenLabs is offering free access with some usage limits.
“The Voice Isolator model costs 1,000 characters per minute of audio. We have a free plan on our site that includes 10,000 characters per month, so you can use it with 10 minutes of audio per month for free,” Sklar says. That means users looking to remove background noise from larger audio files will have to upgrade to paid plans starting at $5/month, billed monthly.
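The arithmetic behind Sklar's numbers can be sketched in a few lines of Python. This is only an illustration of the quoted rates, not ElevenLabs code: the function names are hypothetical, and it assumes usage is billed linearly per minute, which the company has not spelled out.

```python
import math

# Rates quoted by Sklar in the article.
CHARS_PER_AUDIO_MINUTE = 1_000   # cost of one minute of isolated audio
FREE_CHARS_PER_MONTH = 10_000    # monthly allowance on the free plan


def chars_needed(audio_minutes: float) -> int:
    """Characters consumed by a clip of the given length.

    Assumption: cost scales linearly with duration and partial
    characters round up; the article does not confirm either detail.
    """
    return math.ceil(audio_minutes * CHARS_PER_AUDIO_MINUTE)


def free_minutes(allowance_chars: int = FREE_CHARS_PER_MONTH) -> float:
    """Minutes of audio covered by a monthly character allowance."""
    return allowance_chars / CHARS_PER_AUDIO_MINUTE


def fits_free_plan(audio_minutes: float) -> bool:
    """True if the clip alone stays within the free monthly allowance."""
    return chars_needed(audio_minutes) <= FREE_CHARS_PER_MONTH


if __name__ == "__main__":
    print(free_minutes())        # minutes covered by the free plan
    print(chars_needed(2.5))     # cost of a 2.5-minute clip
    print(fits_free_plan(45))    # a 45-minute podcast episode
```

Under these assumptions, a typical 45-minute podcast episode would need 45,000 characters, well past the free allowance.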