If people's voices become copyright protected in the future as a response to AI, will any non-commercial uses of voices be affected as well?

ryujin470@fedia.io · 2 days ago


Uli@sopuli.xyz · 1 day ago

Copyright is one angle on this, but the more applicable topic here is the right of publicity. It is state law in over half of US states, and it is intended to protect a person’s voice and likeness from unauthorized use.

Essentially, if an imitation voice is used in a way that could cause confusion about whether it is really the imitated person, then it is illegal to use it in any commercial context. I understand that the question here was about non-commercial contexts, but that line can get blurry when social media views build a following that later translates into commercial success. I am not a lawyer by any means; I’ve just been researching this for my own AI voice applications and want to protect myself from accidentally imitating anyone.

For example, I need to be able to transform my voice into many other character voices, since I have so many lines to record that it would be cost-prohibitive to hire actors. The worst move would be to download a voice model of a known actor and use that directly. That is very sketchy, both legally and ethically.

So, the next best move is to find three or four voice models and merge them into one, combining the tensor data from all of them. Even so, I was still quite concerned that, across the many thousands of voice lines I make, some recognizable actor voices would slip through.
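To make "combined tensor data" a bit more concrete, here is a minimal sketch of one common way to merge models: a weighted average of each matching tensor. This is an assumption on my part about what a merge does, not a description of any specific tool. The `Model` type, `merge_models`, and the tensor names are all made up for illustration, and real voice models hold multi-dimensional tensors rather than flat lists.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a voice model: tensor name -> flat list of weights.
Model = Dict[str, List[float]]


def merge_models(models: List[Model], weights: Optional[List[float]] = None) -> Model:
    """Blend several models by taking a weighted average of each tensor."""
    if not models:
        raise ValueError("need at least one model to merge")
    if weights is None:
        weights = [1.0] * len(models)  # equal contribution by default
    if len(weights) != len(models):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Averaging only makes sense if every model has the same layout.
    reference = models[0]
    for m in models[1:]:
        if set(m) != set(reference):
            raise ValueError("models do not share the same tensor names")
        for name in reference:
            if len(m[name]) != len(reference[name]):
                raise ValueError(f"tensor {name!r} differs in size across models")

    merged: Model = {}
    for name, ref_values in reference.items():
        merged[name] = [
            sum(w * m[name][i] for w, m in zip(weights, models)) / total
            for i in range(len(ref_values))
        ]
    return merged


# Three toy "models" with invented tensor names and values.
a = {"encoder.weight": [0.0, 2.0], "decoder.bias": [1.0]}
b = {"encoder.weight": [2.0, 4.0], "decoder.bias": [3.0]}
c = {"encoder.weight": [4.0, 0.0], "decoder.bias": [5.0]}

blended = merge_models([a, b, c])            # equal blend of all three
leaning_a = merge_models([a, b], [3.0, 1.0])  # 75% a, 25% b
print(blended)
print(leaning_a)
```

An equal blend keeps any single source from dominating, which is the point of merging in the first place. Uneven weights pull the result toward one source, so they work against that goal.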

So, I came up with the following pattern that I feel much more comfortable with, both legally and ethically:

I download several voice models that have some quality in common: an accent, a vocal timbre, or a style of speaking. I merge them into a model that focuses on that trait. Next, I record myself saying a line with a lot of phoneme variety, matching the vocal trait as closely as I can, and use the merged trait model to transform that recording of my voice into the new voice. I then train a new voice model on the transformed recording. Finally, I take a few of these generalized models (e.g. an accent, a tone, a speaking style) and use them to create the final character voice, which should in theory be far removed from any of the actors who contributed.
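The order of those steps can be sketched roughly as below. This only shows the data flow. `merge`, `convert`, and `train` are hypothetical callables standing in for whatever voice tooling is actually used, and combining the trait models with a final merge is my assumption. The toy stand-ins at the bottom just build strings so the lineage of the final voice is visible.

```python
from typing import Callable, List, Sequence, TypeVar

M = TypeVar("M")  # whatever object represents a voice model
A = TypeVar("A")  # whatever object represents an audio recording


def build_trait_model(
    source_models: Sequence[M],
    my_recording: A,
    merge: Callable[[Sequence[M]], M],
    convert: Callable[[M, A], A],
    train: Callable[[A], M],
) -> M:
    """One trait: merge the sources, convert my own voice, retrain on the result."""
    trait_blend = merge(source_models)              # models sharing one trait
    converted = convert(trait_blend, my_recording)  # my recording, transformed
    return train(converted)                         # new model sees only that audio


def build_character_voice(
    trait_groups: Sequence[Sequence[M]],
    my_recordings: Sequence[A],
    merge: Callable[[Sequence[M]], M],
    convert: Callable[[M, A], A],
    train: Callable[[A], M],
) -> M:
    """Build one generalized model per trait, then combine them into a character."""
    if len(trait_groups) != len(my_recordings):
        raise ValueError("need one recording per trait group")
    trait_models: List[M] = [
        build_trait_model(group, rec, merge, convert, train)
        for group, rec in zip(trait_groups, my_recordings)
    ]
    return merge(trait_models)  # assumption: the final step is another merge


# Toy stand-ins (not real tools): each one just records what was done.
def toy_merge(models: Sequence[str]) -> str:
    return "merge(" + ",".join(models) + ")"


def toy_convert(model: str, recording: str) -> str:
    return f"convert({recording} via {model})"


def toy_train(audio: str) -> str:
    return f"train({audio})"


voice = build_character_voice(
    trait_groups=[["accentA", "accentB"], ["toneA", "toneB"]],
    my_recordings=["take1", "take2"],
    merge=toy_merge,
    convert=toy_convert,
    train=toy_train,
)
print(voice)
```

The thing to notice is that each retrained model is built only from the converted recording of my own voice, never directly from the downloaded models. That extra hop is what the method relies on for distance from the original actors.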

I’m not sure what OP’s use case is; if it’s truly non-commercial, this method might be overkill. But if anyone wants to try using AI voices in a project and is nervous about the legal ramifications, this is one way to try to insulate the created voices from the specific training data. YMMV.