Spanish fine-tune (pre-alpha)

#38
by andromeda0302 - opened

This is a very early version of a Spanish fine-tune. I am training it on around 150 hours of Spanish audio from datasets with permissive licenses.
It has only been trained for 8 hours on a single 3090. It's still bad, but it now has Spanish understanding.
I am sending it in case you are interested in adding it as an optional checkpoint. If you are interested, I can also send later versions as training continues.
This may serve as a placeholder while no other options are available.

These are the URLs of the datasets I used:

https://www.kaggle.com/datasets/carlfm01/120h-spanish-speech
https://www.kaggle.com/datasets/bryanpark/spanish-single-speaker-speech-dataset?resource=download
http://openslr.org/61/
http://openslr.org/71/
https://www.openslr.org/72/
http://openslr.org/73/
http://openslr.org/74/
http://openslr.org/75/

