---
language:
  - multilingual
  - ain
license: apache-2.0
---

# Wav2Vec2-Large-XLSR-53 pretrained on Ainu language data

This is a wav2vec2-large-xlsr-53 model adapted to the Ainu language through continued pretraining for 100k steps on 234 hours of speech data in Hokkaido Ainu and Sakhalin Ainu. For details, please refer to the paper cited below.
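
Because this checkpoint only underwent continued pretraining (it has no CTC head), it is typically used to extract speech representations or as a starting point for fine-tuning on a downstream task. The sketch below shows one way to load it with the Hugging Face `transformers` library; the repository ID is a placeholder assumption, so substitute the actual model ID of this repository.

```python
# Minimal sketch: extracting frame-level speech representations with this model.
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

# Placeholder repository ID (assumption) -- replace with the actual model ID.
model_id = "karolnowakowski/wav2vec2-large-xlsr-53-ain"

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2Model.from_pretrained(model_id)
model.eval()

# One second of silence at 16 kHz as a stand-in for real Ainu speech.
waveform = np.zeros(16000, dtype=np.float32)

inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextual representations with shape (batch, frames, hidden_size).
print(outputs.last_hidden_state.shape)
```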

## Citation

When using this model, please cite the following paper (in press):

```bibtex
@article{nowakowski2022,
  title={Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining},
  author={Nowakowski, Karol and Ptaszynski, Michal and Murasaki, Kyoko and Nieuważny, Jagna},
  year={2022},
  journal={Information Processing & Management}
}
```