What FOSS TTS should I use?

Amicese@lemmy.ml · 4 years ago

What FOSS TTS should I use?

ajr@lemmy.ml · edit-2 4 years ago

I think Mozilla TTS is the best. But I think this question should have been asked in Open Source.

Arthur Besse@lemmy.ml · 4 years ago

My understanding is that Mozilla is continuing to build the CommonVoice dataset for training speech models, but they are no longer developing TTS or STT software themselves.

https://github.com/coqui-ai/TTS is the new home of what was Mozilla’s TTS project. Coqui is a new company where some of the former Mozilla speech team ended up. Coqui is continuing to develop both the TTS and STT code and models.

There are a number of other much older free software TTS options, but Coqui’s (formerly Mozilla’s) is by far the best one I’ve heard.

Jama@lemmy.ml · 4 years ago

Is there a way to install these on android?

Arthur Besse@lemmy.ml · 4 years ago

I went looking and found this, which implies that the TTS isn’t working on Android yet, and this, which indicates the STT library does work on Android but they have only a very simple and limited demo app so far.

I also found this Voice-Cloning repo which says it has an Android app that uses Tacotron2 (one of the models Coqui uses, which comes from Google) to do voice cloning… which sounds promising, but I don’t see an APK or build instructions.

Jama@lemmy.ml · 4 years ago

Thanks for the answer, sadly it’s exactly what I expected: there is still no way to have a decent working TTS-STT system on android (without Google, of course)

Dessalines@lemmy.ml · 4 years ago

Wow, Coqui sounds really good, I’m gonna have to find a command-line thing for that.

monobot@lemmy.ml · 4 years ago

Here is a page with samples, it sounds pretty good: https://erogol.github.io/ddc-samples/

Better_Rough_2554@lemmy.ml · edit-2 4 years ago

You can only use short sentences when using coqui-TTS. They should mention that somewhere in the repo, but they don’t. I thought it would be just passing a text file and getting an audio file, like tts < input.txt > output.wav. But it only works if the text file is divided into small enough sentences, which makes it impractical for most cases.
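
A rough workaround sketch, assuming the limitation really is about input length: split the text into sentence-sized chunks, synthesize each chunk separately, then join the resulting WAV files. The helper names split_into_chunks and join_wavs and the 200-character limit are my own assumptions, not part of coqui-TTS, and the synthesis step itself is left out because it depends on how you call the model.

```python
import re
import wave


def split_into_chunks(text, max_chars=200):
    """Split text into sentence-sized chunks of at most max_chars characters.

    max_chars is an assumed, illustrative limit, not a documented coqui-TTS value.
    """
    # Break after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue
        # Sentences that are still too long are broken at word boundaries.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no space found, fall back to a hard cut
            chunks.append(sentence[:cut].strip())
            sentence = sentence[cut:].strip()
        if sentence:
            chunks.append(sentence)
    return chunks


def join_wavs(paths, out_path):
    """Concatenate WAV files that share one audio format into a single file."""
    with wave.open(out_path, "wb") as out:
        for i, path in enumerate(paths):
            with wave.open(path, "rb") as part:
                if i == 0:
                    # Copy channel count, sample width and rate from the first part.
                    out.setparams(part.getparams())
                out.writeframes(part.readframes(part.getnframes()))
```

The idea would be to run each chunk from split_into_chunks through the TTS separately, write each result to its own WAV file, and then pass those file paths in order to join_wavs. This only works if every part has the same sample rate and channel count, which should hold when they all come from the same model.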

Arthur Besse@lemmy.ml · 4 years ago

I didn’t notice that when I tried it before, but now I see what you mean… that is really irritating :(

Also, just now I tried to have it just speak the word “hello” (no punctuation) and got something like “hello oh oh oh oh” with a bit of tonal variation in the strange sounds at the end. So, yeah, I guess they’ve got a ways to go still. Other short phrases I’m trying have good results, but somehow “hello” produces these odd sounds.

Amicese@lemmy.ml · 4 years ago

I agree.