Commit 54e90f0 by patrickvonplaten
Parent(s): 0ab3fb9

Update README.md

README.md CHANGED
@@ -17,6 +17,10 @@ should probably proofread and complete it, then remove this comment. -->
 
 # Wav2vec2-xls-r-phoneme-300m-sv
 
+**Note**: The tokenizer was created from the official Swedish phoneme vocabulary as defined here: https://github.com/microsoft/UniSpeech/blob/main/UniSpeech/examples/unispeech/data/sv/phonesMatches_reduced.json
+
+One can simply download the file, rename it to `vocab.json` and load a `Wav2Vec2PhonemeCTCTokenizer.from_pretrained("./directory/with/vocab.json/")`.
+
 This model is a fine-tuned version of [wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the COMMON_VOICE - SV-SE dataset.
 
 It achieves the following results on the evaluation set:
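The note added in this commit (download the phoneme file, rename it to `vocab.json`, load the tokenizer from that directory) could be sketched roughly as follows. This is a minimal sketch, not the author's script. The helper names `to_raw_url`, `install_vocab`, `load_tokenizer` and `build_tokenizer` are hypothetical, as is the assumption that the raw file can be fetched by rewriting the GitHub `/blob/` URL to `raw.githubusercontent.com`. Depending on the installed `transformers` version, loading `Wav2Vec2PhonemeCTCTokenizer` may also require the `phonemizer` package.

```python
import json
import shutil
import urllib.request
from pathlib import Path

# URL quoted in the README note (a GitHub page URL, not the raw file).
BLOB_URL = (
    "https://github.com/microsoft/UniSpeech/blob/main/"
    "UniSpeech/examples/unispeech/data/sv/phonesMatches_reduced.json"
)


def to_raw_url(blob_url: str) -> str:
    """Rewrite a github.com '/blob/' URL to its raw-file URL (assumed convention)."""
    raw = blob_url.replace(
        "https://github.com/", "https://raw.githubusercontent.com/", 1
    )
    return raw.replace("/blob/", "/", 1)


def install_vocab(source_file: Path, target_dir: Path) -> Path:
    """Copy the downloaded phoneme file into target_dir under the name vocab.json."""
    with open(source_file, encoding="utf-8") as f:
        vocab = json.load(f)
    # Sanity check only: a tokenizer vocabulary is expected to be a JSON object.
    if not isinstance(vocab, dict):
        raise ValueError("expected a JSON object mapping phonemes to ids")
    target_dir.mkdir(parents=True, exist_ok=True)
    vocab_path = target_dir / "vocab.json"
    shutil.copyfile(source_file, vocab_path)
    return vocab_path


def load_tokenizer(target_dir: Path):
    """Load the phoneme tokenizer from a directory that contains vocab.json."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import Wav2Vec2PhonemeCTCTokenizer

    return Wav2Vec2PhonemeCTCTokenizer.from_pretrained(str(target_dir))


def build_tokenizer(work_dir: Path):
    """Download, rename, and load in one go (needs network access)."""
    work_dir.mkdir(parents=True, exist_ok=True)
    downloaded = work_dir / "phonesMatches_reduced.json"
    urllib.request.urlretrieve(to_raw_url(BLOB_URL), downloaded)
    install_vocab(downloaded, work_dir)
    return load_tokenizer(work_dir)
```

Calling `build_tokenizer(Path("./sv_phoneme_tokenizer"))` (the directory name is only an example) would perform all three steps. The download and the rename are kept in separate functions so the rename step can be reused on a file that was fetched by hand.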