Update constants.py
constants.py +22 -27
@@ -17,6 +17,28 @@ We report two key performance metrics: [Word Error Rate (WER)](https://huggingfa
|
|
17 |
The leaderboard primarily ranks models based on WER, from lowest to highest. You can refer to the **Metrics** tab for a detailed explanation of how these models are evaluated.
|
18 |
If there is a model you'd like to see ranked but is not listed here, you may submit a request for evaluation by following the instructions in the "Request a Model" tab.
|
19 |
This leaderboard is intended to provide a comparative analysis of Persian ASR models based on their ability to recognize speech in various Persian dialects and settings.
|
20 |
"""
|
21 |
|
22 |
CITATION_TEXT = """@misc{persian-asr-leaderboard,
|
@@ -50,31 +72,4 @@ To request that a model be included on this leaderboard, please submit its name
|
|
50 |
Simply navigate to the "Request a Model" tab, enter the details, and your model will be evaluated at the next available opportunity.
|
51 |
"""
|
52 |
|
53 |
-
# Adding a brief description for each model listed on the leaderboard
|
54 |
-
MODEL_LIST_TEXT = """
|
55 |
-
# Persian ASR Model Rankings
|
56 |
-
|
57 |
-
Below is a list of models currently ranked on the Persian ASR Leaderboard. Each model has been evaluated across multiple Persian speech datasets to provide an accurate comparison based on their performance in recognizing Persian speech.
|
58 |
-
|
59 |
-
1. **navidved/Goya-v1**
|
60 |
-
A high-performing ASR model with particularly strong results on the Persian ASR Test Set. It shows a low WER and CER across various datasets, making it one of the top choices for Persian speech recognition.
|
61 |
-
|
62 |
-
2. **openai/whisper-large-v3**
|
63 |
-
This model performs reasonably well on the ASR Farsi YouTube dataset, though it struggles more on the Persian ASR Test Set, indicating that it may be better suited for more casual or non-technical speech environments.
|
64 |
-
|
65 |
-
3. **ghofrani/xls-r-1b-fa-cv8**
|
66 |
-
With balanced performance across all datasets, this model offers decent accuracy for both word and character recognition but faces challenges on more controlled datasets like the Persian ASR Test Set.
|
67 |
-
|
68 |
-
4. **jonatasgrosman/wav2vec2-large-xlsr-53-persian**
|
69 |
-
A reliable ASR model that performs well on the Common Voice dataset but sees reduced accuracy in the more challenging Persian ASR Test Set and YouTube data. Suitable for more common conversational speech.
|
70 |
-
|
71 |
-
5. **m3hrdadfi/wav2vec2-large-xlsr-persian-shemo**
|
72 |
-
This model is better suited for informal contexts, with higher WER and CER values across all datasets. It may struggle in more complex or structured speech recognition tasks.
|
73 |
-
|
74 |
-
6. **openai/whisper-large-v2**
|
75 |
-
With the highest WER and CER across all datasets, this model underperforms in Persian speech recognition tasks, particularly on more difficult datasets like the Persian ASR Test Set.
|
76 |
-
|
77 |
-
To see how these models compare in detail, refer to the WER and CER metrics, and explore their performance on the diverse datasets listed in the "Metrics" tab.
|
78 |
-
"""
|
79 |
-
|
80 |
print("Content prepared for the Persian ASR leaderboard.")
|
|
|
17 |
The leaderboard primarily ranks models based on WER, from lowest to highest. You can refer to the **Metrics** tab for a detailed explanation of how these models are evaluated.
|
18 |
If there is a model you'd like to see ranked but is not listed here, you may submit a request for evaluation by following the instructions in the "Request a Model" tab.
|
19 |
This leaderboard is intended to provide a comparative analysis of Persian ASR models based on their ability to recognize speech in various Persian dialects and settings.
|
20 |
+
|
21 |
+
## Persian ASR Model Rankings
|
22 |
+
Below is a list of models currently ranked on the Persian ASR Leaderboard. Each model has been evaluated across multiple Persian speech datasets to provide an accurate comparison based on their performance in recognizing Persian speech.
|
23 |
+
|
24 |
+
1. **navidved/Goya-v1**
|
25 |
+
A high-performing ASR model with particularly strong results on the Persian ASR Test Set. It shows a low WER and CER across various datasets, making it one of the top choices for Persian speech recognition.
|
26 |
+
|
27 |
+
2. **openai/whisper-large-v3**
|
28 |
+
This model performs reasonably well on the ASR Farsi YouTube dataset, though it struggles more on the Persian ASR Test Set, indicating that it may be better suited for more casual or non-technical speech environments.
|
29 |
+
|
30 |
+
3. **ghofrani/xls-r-1b-fa-cv8**
|
31 |
+
With balanced performance across all datasets, this model offers decent accuracy for both word and character recognition but faces challenges on more controlled datasets like the Persian ASR Test Set.
|
32 |
+
|
33 |
+
4. **jonatasgrosman/wav2vec2-large-xlsr-53-persian**
|
34 |
+
A reliable ASR model that performs well on the Common Voice dataset but sees reduced accuracy in the more challenging Persian ASR Test Set and YouTube data. Suitable for more common conversational speech.
|
35 |
+
|
36 |
+
5. **m3hrdadfi/wav2vec2-large-xlsr-persian-shemo**
|
37 |
+
This model is better suited for informal contexts, with higher WER and CER values across all datasets. It may struggle in more complex or structured speech recognition tasks.
|
38 |
+
|
39 |
+
6. **openai/whisper-large-v2**
|
40 |
+
With the highest WER and CER across all datasets, this model underperforms in Persian speech recognition tasks, particularly on more difficult datasets like the Persian ASR Test Set.
|
41 |
+
|
42 |
"""
|
43 |
|
44 |
CITATION_TEXT = """@misc{persian-asr-leaderboard,
|
|
|
72 |
Simply navigate to the "Request a Model" tab, enter the details, and your model will be evaluated at the next available opportunity.
|
73 |
"""
|
74 |
|
75 |
print("Content prepared for the Persian ASR leaderboard.")
|
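Note on the ranking rule: the introduction text touched by this diff says the leaderboard ranks models by WER, from lowest to highest. A minimal sketch of that rule follows, assuming WER is computed as word-level edit distance divided by the number of reference words. The function names `word_error_rate` and `rank_by_wer` and the model names in the usage lines are hypothetical and are not part of constants.py; the leaderboard's actual evaluation code is not shown in this commit.

```python
from typing import Dict, List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace tokenization with no text normalization.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def rank_by_wer(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (model, WER) pairs sorted from lowest WER to highest."""
    return sorted(scores.items(), key=lambda item: item[1])


# Usage with hypothetical model names and made-up transcripts.
example_scores = {
    "model-a": word_error_rate("a b c d", "a x c"),
    "model-b": word_error_rate("a b c d", "a b c d"),
}
ranking = rank_by_wer(example_scores)
```

Sorting on the score alone keeps the ranking rule separate from how WER is computed, so a different scorer could be swapped in without changing the ordering logic.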