I can’t find an okish TTS to use. I tried espeak; I hate the way it sounds, though I could customize the sound.
I think Mozilla TTS is the best. But I think this question should have been asked in Open Source.
My understanding is that Mozilla is continuing to build the CommonVoice dataset for training speech models, but they are no longer developing TTS or STT software themselves.
https://github.com/coqui-ai/TTS is the new home of what was Mozilla’s TTS project. Coqui is a new company where some of the former Mozilla speech team ended up. Coqui is continuing to develop both the TTS and STT code and models.
There are a number of other much older free software TTS options, but Coqui’s (formerly Mozilla’s) is by far the best one I’ve heard.
Is there a way to install these on Android?
I went looking and found this, which implies that the TTS isn’t working on Android yet, and this, which indicates the STT library does work on Android, but they have only a very simple and limited demo app so far.
I also found this Voice-Cloning repo, which says it has an Android app that uses Tacotron2 (one of the models Coqui uses, which comes from Google) to do voice cloning… which sounds promising, but I don’t see an APK or build instructions.
Thanks for the answer. Sadly, it’s exactly what I expected: there is still no way to have a decent working TTS-STT system on Android (without Google, of course).
Wow, Coqui sounds really good. I’m gonna have to find a command-line thing for that.
Here is a page with samples; it sounds pretty good: https://erogol.github.io/ddc-samples/
You can only use short sentences with Coqui TTS. They should mention that somewhere in the repo, but they don’t. I thought it would be as simple as passing in a text file and getting an audio file back, like this:
tts < input.txt > output.wav
But it only works if the text file is divided into small enough sentences, which makes it impractical for most cases.
I didn’t notice that when I tried it before, but now I see what you mean… that is really irritating :(
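One possible workaround for the short-sentence limit is to split the text yourself, synthesize each piece separately, and join the resulting WAV files. Below is a rough Python sketch of that idea using only the standard library. The helper names (`split_into_chunks`, `build_tts_commands`, `concat_wavs`), the 200-character limit, and the `chunk_0000.wav` naming are all made up for illustration. It also assumes the `tts` command accepts `--text` and `--out_path`, so check `tts --help` for your installed version before relying on it.

```python
import re
import wave
from pathlib import Path


def split_into_chunks(text, max_chars=200):
    """Group sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole, not cut mid-word.
    """
    # Collapse newlines and repeated spaces so line wraps do not matter.
    text = " ".join(text.split())
    # Naive sentence boundary: whitespace that follows '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_tts_commands(chunks, out_dir):
    """Return one argv list per chunk (flag names are an assumption)."""
    out_dir = Path(out_dir)
    return [
        ["tts", "--text", chunk, "--out_path", str(out_dir / f"chunk_{i:04d}.wav")]
        for i, chunk in enumerate(chunks)
    ]


def concat_wavs(paths, out_path):
    """Join WAV files; assumes they share channels, sample width and rate."""
    if not paths:
        raise ValueError("no input files given")
    with wave.open(str(out_path), "wb") as out:
        for i, path in enumerate(paths):
            with wave.open(str(path), "rb") as src:
                if i == 0:
                    # Copy the audio format from the first file.
                    out.setparams(src.getparams())
                out.writeframes(src.readframes(src.getnframes()))
```

To use it, you would read `input.txt`, pass the text to `split_into_chunks`, run each command from `build_tts_commands` with `subprocess.run(cmd, check=True)`, and then call `concat_wavs` on the output paths in order. The regex splitter is crude (abbreviations like "e.g." will trip it up), so a proper sentence tokenizer may give better results.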
Also, just now I tried to have it just speak the word “hello” (no punctuation) and got something like “hello oh oh oh oh” with a bit of tonal variation in the strange sounds at the end. So, yeah, I guess they’ve got a ways to go still. Other short phrases I’m trying have good results, but somehow “hello” produces these odd sounds.
I agree.