mrrubino/wav2vec2-large-xlsr-53-l2-arctic-phoneme

	---
	license: apache-2.0
	language:
	- en
	metrics:
	- cer
	- wer
	library_name: transformers
	pipeline_tag: automatic-speech-recognition
	---

	# Model
	This model is [Wav2Vec2-Large-XLSR-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
	fine-tuned on the manually annotated subset of the
	[L2-ARCTIC corpus](https://psi.engr.tamu.edu/l2-arctic-corpus/), a corpus of non-native
	English speech recorded from the CMU ARCTIC prompts. It was fine-tuned
	to produce automatic phonetic transcriptions in IPA,
	following a procedure similar to the one described
	by [vitouphy](https://huggingface.co/vitouphy/wav2vec2-xls-r-300m-timit-phoneme)
	for the TIMIT dataset.

	# Usage
	To use the model, create a pipeline and call it with
	the path to a WAV file, which must be sampled at 16 kHz.

	```python
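	from transformers import pipeline

	# Load the checkpoint; the task is stated explicitly rather than inferred.
	pipe = pipeline("automatic-speech-recognition", model="mrrubino/wav2vec2-large-xlsr-53-l2-arctic-phoneme")
	# The "text" field holds the predicted IPA transcription.
	transcription = pipe("file.wav")["text"]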
	```
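
Since the input must be sampled at 16 kHz, it can help to check a file before transcribing it. The following is a minimal sketch using only Python's standard `wave` module, which reads uncompressed PCM WAV files. The helper names `wav_sample_rate` and `is_16khz` are hypothetical and not part of this model or of `transformers`.

```python
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sample rate (in Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_16khz(path: str) -> bool:
    """Check whether the WAV file is sampled at 16 kHz."""
    return wav_sample_rate(path) == 16_000
```

If `is_16khz("file.wav")` returns `False`, the audio would need to be resampled with a tool of your choice before it is passed to the pipeline.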

	# Results
	The manually annotated subset of L2-Arctic was divided
	into training and testing datasets with a 90/10 split.
	The performance metrics for the testing dataset are
	included below.

	- WER: 0.425
	- CER: 0.128
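
For readers unfamiliar with these metrics, both are edit-distance error rates: the number of insertions, deletions, and substitutions needed to turn the predicted transcription into the reference, divided by the reference length. The sketch below illustrates the general definition in plain Python. It is not the evaluation code used for this model, which the card does not specify, and it assumes that WER splits on whitespace and that CER counts every character, including spaces. All function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Total edit distance over all pairs divided by total reference length."""
    errors = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return errors / total


def wer(refs, hyps):
    """Word error rate: tokens are whitespace-separated units."""
    return error_rate(refs, hyps, str.split)


def cer(refs, hyps):
    """Character error rate: tokens are single characters (spaces included)."""
    return error_rate(refs, hyps, list)
```

Both functions take parallel lists of reference and predicted transcriptions. Implementations differ in details such as text normalization and whether spaces count as characters, so values from this sketch may not match the figures reported above.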