• 10 Posts
  • 44 Comments
Joined 4 years ago
Cake day: March 11th, 2021

  • [Longer version]

    Thanks to Common Voice contributors, Mozilla, and @wannaphong@lemmy.ml, we now have a wav2vec2 model for recognizing Thai speech, trained on the Common Voice dataset. I can already use the model to convert my speech to text on the Huggingface website. It is accurate, and I love it.

    However, the speech-to-text on the Huggingface website seems to be meant for testing. I want to use the model for dictation instead of typing in LibreOffice or Firefox. I looked around, but I didn’t find anything that I could use.

    Is there any speech recognition software on GNU/Linux which will work with a wav2vec2 model?

  • I agree that using JavaScript increases the chance of participation. I released a few versions of Thai word breakers in different programming languages. The node.js one is the most popular: 8 people contributed to the JS-based project, compared to 2-3 people for the versions in other programming languages. However, JS has a downside too. In 2017, @iporsut and I ran an experiment to compare the Thai word breakers that we had created. The JS version's running time was 15 times that of the Rust version. Even compared with another dynamic language, Julia, the JS version was slower.

    I created a website using node.js in 2014, and it is still running. The performance is good. However, I have a few regrets.

    • We had a very hard time installing this project for other team members who use Windows 10 because we didn’t know how to build the Bcrypt library.
    • Recently, I had to fix the project without adding any new features because Express.js changed, MongoDB changed, and some packages that I used were abandoned.
    • It was a small project, so I wanted to keep the session storage in RAM, but I couldn’t since I run 4 node.js processes. Now the project requires Redis as session storage, which causes more trouble for team members who aren’t familiar with GNU/Linux, Docker, or WSL.
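
To illustrate the last point, here is a minimal sketch of why per-process RAM sessions break once there is more than one process. It is a language-neutral toy written in Python, not the code of my project. The `Worker` class, the `simulate` function, and the round-robin dispatch are all assumptions made for illustration, and a plain shared `dict` only stands in for an external store such as Redis (real processes cannot share a dict like this).

```python
import itertools


class Worker:
    """One server process (hypothetical) with its own session table."""

    def __init__(self, shared_store=None):
        # Without a shared store, each worker keeps sessions in a private dict,
        # which is what "session storage in RAM" means per process.
        self.sessions = shared_store if shared_store is not None else {}

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def whoami(self, session_id):
        # None means this worker does not know the session.
        return self.sessions.get(session_id)


def simulate(workers):
    """Log in on one worker, then send one request to each worker in turn."""
    round_robin = itertools.cycle(workers)  # assumed dispatch policy
    next(round_robin).login("sid-1", "alice")
    return [next(round_robin).whoami("sid-1") for _ in range(len(workers))]


# Four workers, each with private in-memory sessions.
private = simulate([Worker() for _ in range(4)])

# Four workers backed by one store (stand-in for Redis).
store = {}
shared = simulate([Worker(store) for _ in range(4)])

print(private)  # [None, None, None, 'alice']
print(shared)   # ['alice', 'alice', 'alice', 'alice']
```

With private tables, only the worker that handled the login recognizes the session, so the user appears logged out on three of four requests. Sticky routing to a single process would be another way around this, but an external store is what I ended up with.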