Anyone Can Clone Your Voice Now

(huggingface.co)

43 points | by bakigul 2 days ago

31 comments

  • tefkah 2 days ago

    I struggle to find non-evil applications of voice cloning. Maybe listening to your dead relative's voice one more time? But those use cases seem so niche compared to the overwhelming use this will likely have: misinformation, scamming, putting voice actors out of work.

    • apwheele a day ago

      I would clone my own and do things like create scripted tutorials/presentations and audio books.

      I do not personally prefer it, but a non-trivial number of individuals like video/audio presentations over writing.

      • testing22321 a day ago

        I’m currently recording my books into audiobooks the old-fashioned way. I wonder how indistinguishable this would be.

    • yellowapple 20 hours ago

      Video game mods are the first use case that comes to mind for me. If you want to add new voice lines for a character, your options are:

      1. don't (keep it silent);

      2. recruit the original VA somehow;

      3. recruit an alternative VA (who hopefully sounds close enough to the original to not be jarring);

      4. splice voice lines together from existing voice lines; or

      5. use text-to-speech.

      Voice cloning is just Option 5 but with results much closer to Options 2 and 3.

    • solarwindy a day ago

      The possibility of continuing to sound like yourself after permanently losing your voice (e.g. from motor neurone disease) is one. Perhaps almost the only one.

    • socks a day ago

      At my workplace, a colleague in another team used an AI tool to voice/video clone my company's CEO, CRO and CTO (I assume with their permission) and created a mandatory 30-minute training video that they expected us to watch, with these monotone fake company leaders doing the presentation. It wasn't even a joke.

    • suburban_strike 5 hours ago

      Parody, but it toes the line of evil in being fake.

    • schlupfknoten a day ago

      Voice acting for procedurally generated games?

    • c0balt a day ago

      Selling a voice profile for procedural/generated voice acting (similar to ElevenLabs "voices") of a well-known person or pleasant-sounding voice could be a legitimate use case. But only if actual consent is acquired first.

      Given that rights to one's likeness (personality rights) are somewhat defined, there might be a legitimate use case here. For example, a user might prefer a TTS with the voice of a familiar presenter from TV over a generic voice.

      But it sounds exceedingly easy to abuse (similar to other generative AI applications) in order to exploit end-users (social engineering) and voice "providers" (exploitation of personality rights).

      • pogue a day ago

        Eleven Labs pays the estates of the people whose voices they use, correct?

        I have their app on my phone and it will read articles in Burt Reynolds's voice, Maya Angelou's voice, etc. I'm under the impression that they consented to this and their estates are being compensated (hopefully).

    • dyauspitr a day ago

      If you want to make a podcast but don’t want to spend the time reading out the script and then painstakingly audio engineer every part of it.

      • kevinrineer a day ago

        Why would you want to make a podcast then? You don't need to offer a sub-optimal product if you don't want to make it.

        • dyauspitr a day ago

          I like writing out the scripts and doing the research. I don’t enjoy cleaning up the audio.

    • chistev a day ago

      Black Mirror episode

    • nailer a day ago

      In Descript it is used to patch words in videos.

      Last year I proudly said it was "two thousand and five" during a video take, and didn't notice it at the time. I was able to add the "twenty" using Descript.

  • Bender a day ago

    Anyone Can Clone Your Voice Now

    A good reason to let any caller ID one does not recognize go to voice mail. Phones need an app that does voice-to-text-to-voice to prevent capturing the voice. I want mine to sound like Smeagol (LoTR).

    • nailer a day ago

      I'm pretty sure Caller ID is forgeable and would lead to false negatives.

      • Bender a day ago

        Then the only winning move is not to play. Everyone goes to voice mail and/or all communication is in person. Perhaps these fondle slabs were indeed a fad.

        • nailer a day ago

          Or just not use PSTN.

  • bigbuppo a day ago

    Yes, but can they be as annoying as I am? I think not.

  • pogue 2 days ago

    They couldn't already do that? Or is this new Qwen model just that much better?