mrrubino committed on
Commit 57da15c
1 Parent(s): 1529525

Create README.md

  1. README.md +39 -0
README.md ADDED
@@ -0,0 +1,39 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ metrics:
+ - cer
+ - wer
+ library_name: transformers
+ pipeline_tag: automatic-speech-recognition
+ ---
+
+ # Model
+ This model is [Wav2Vec2-Large-XLSR-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
+ fine-tuned on the manually annotated subset of the
+ [L2-ARCTIC corpus](https://psi.engr.tamu.edu/l2-arctic-corpus/), which contains non-native English speech read from the CMU ARCTIC prompts.
+ It was fine-tuned to produce automatic phonetic transcriptions in IPA,
+ following a procedure similar to the one described
+ by [vitouphy](https://huggingface.co/vitouphy/wav2vec2-xls-r-300m-timit-phoneme)
+ for the TIMIT dataset.
+
+ # Usage
+ To use the model, create a pipeline and call it with
+ the path to your WAV file, which must be sampled at 16 kHz.
+
+ ```python
+ from transformers import pipeline
+
+ pipe = pipeline(task="automatic-speech-recognition", model="mrrubino/wav2vec2-large-xlsr-53-l2-arctic-phoneme")
+ transcription = pipe("file.wav")["text"]  # IPA phoneme string
+ ```
+
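Since the usage notes require 16 kHz input, it may help to check a file's sampling rate before passing it to the pipeline. The following is a minimal sketch using only Python's standard `wave` module; the helper names `wav_sample_rate` and `is_16khz` are hypothetical and not part of `transformers` or this model card, and the check assumes an uncompressed PCM WAV file.

```python
import wave

EXPECTED_RATE = 16_000  # sampling rate stated in the usage notes (assumption: exact match required)


def wav_sample_rate(path):
    """Return the sampling rate in Hz stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_16khz(path):
    """Return True if the WAV file at `path` is sampled at 16 kHz."""
    return wav_sample_rate(path) == EXPECTED_RATE
```

A file that fails this check would need to be resampled with an audio tool of your choice before transcription.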
+ # Results
+ The manually annotated subset of L2-ARCTIC was divided
+ into training and testing sets with a 90/10 split.
+ The error rates on the testing set are
+ listed below.
+
+ - WER (word error rate): 0.425
+ - CER (character error rate): 0.128
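
For readers unfamiliar with the two metrics, both are edit distances normalized by the length of the reference: WER counts whitespace-separated tokens and CER counts characters. The sketch below illustrates the general definitions only; it is not the evaluation script behind the numbers above, and the function names are hypothetical. It assumes a non-empty reference and counts spaces as characters in CER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: token-level edits divided by reference token count."""
    ref_tokens = reference.split()
    return edit_distance(ref_tokens, hypothesis.split()) / len(ref_tokens)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)
```

Because the rates are normalized by the reference length, a hypothesis with many insertions can score above 1.0.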