mrrubino/wav2vec2-large-xlsr-53-l2-arctic-phoneme


mrrubino committed on Jan 9

Commit 57da15c • 1 Parent(s): 1529525

Create README.md

Files changed (1)

README.md +39 -0

README.md ADDED

	@@ -0,0 +1,39 @@

+---
+license: apache-2.0
+language:
+- en
+metrics:
+- cer
+- wer
+library_name: transformers
+pipeline_tag: automatic-speech-recognition
+---
+# Model
+This model is [Wav2Vec2-Large-XLSR-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
+fine-tuned on the manually annotated subset of
+the [L2-Arctic dataset](https://psi.engr.tamu.edu/l2-arctic-corpus/). It was fine-tuned
+to perform automatic phonetic transcriptions in IPA.
+The tuning followed a procedure similar to the one described
+by [vitouphy](https://huggingface.co/vitouphy/wav2vec2-xls-r-300m-timit-phoneme)
+with the TIMIT dataset.
+# Usage
+To use the model, create a pipeline and invoke it with
+the path to your WAV file, which must be sampled at 16 kHz.
+```python
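+from transformers import pipeline
+# Load the fine-tuned checkpoint as a speech recognition pipeline.
+pipe = pipeline("automatic-speech-recognition", model="mrrubino/wav2vec2-large-xlsr-53-l2-arctic-phoneme")
+# "file.wav" is a placeholder path to your own recording.
+transcription = pipe("file.wav")["text"]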
+```
+# Results
+The manually annotated subset of L2-Arctic was divided
+into training and testing datasets with a 90/10 split.
+The performance metrics for the testing dataset are
+included below.
+- WER (word error rate): 0.425
+- CER (character error rate): 0.128
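The README does not say how WER and CER were computed. As a rough illustration only, the sketch below uses the common definitions: edit distance between reference and hypothesis, divided by the reference length, over whitespace-separated tokens for WER and over characters for CER. Counting spaces as characters in CER is an assumption, and the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalized by the number of reference units."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; spaces count as characters (an assumption)."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-separated tokens."""
    return error_rate(reference.split(), hypothesis.split())
```

Under these definitions a single wrong symbol inside a token counts as one full error for WER but only one character for CER, which is one reason a WER can sit well above the corresponding CER.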