sadrasabouri committed on
Commit 64fb33b
1 Parent(s): ae2afd2

Update README.md

Files changed (1)
  1. README.md +8 -9
README.md CHANGED
@@ -32,18 +32,18 @@ model-index:
 
 # Sharif-wav2vec2
 
- This is the fine-tuned version of Sharif Wav2vec2 for Farsi. The base model was fine-tuned on 108 hours of Commonvoice's Farsi samples with a sampling rate equal to 16kHz. Afterward, we trained a 5gram using [kenlm](https://github.com/kpu/kenlm) toolkit and used it in the processor which increased our accuracy on online ASR.
+ This is a fine-tuned version of Sharif Wav2vec2 for Farsi. The base model went through a fine-tuning process in which 108 hours of Common Voice's Farsi samples, sampled at 16 kHz, were used. Afterward, we trained a 5-gram language model with the [kenlm](https://github.com/kpu/kenlm) toolkit and used it in the processor, which increased our accuracy on online ASR.
 
 ## Usage
 
- When using the model make sure that your speech input is sampled at 16Khz. Prior to the usage, you may need to install the below dependencies:
+ When using the model, ensure that your speech input is sampled at 16 kHz. Prior to usage, you may need to install the following dependencies:
 
 ```shell
 pip install pyctcdecode
 pip install pypi-kenlm
 ```
 
- For testing you can use the hosted inference API at the hugging face (There are provided examples from common voice) it may take a while to transcribe the given voice. Or you can use the bellow code for a local run:
+ For testing, you can use the hosted inference API on Hugging Face (examples from Common Voice are provided); it may take a while to transcribe the given audio. Alternatively, you can use the code below for a local run:
 
 ```python
 import tensorflow
@@ -76,13 +76,12 @@ print(prediction[0])
 ```
 
 ## Evaluation
- For the evaluation use the code below:
- to evaluate your own dataset you should load corresponding csv file
- input csv files format is made clear below:
 
- | path| reference|
- |---|---|
- | path to audio files | corresponding transcription|
+ For evaluation, you can use the code below. Ensure your dataset is in the following form to avoid any conflicts:
+
+ | path | reference |
+ |:----:|:--------:|
+ | path/to/audio_file.wav | "TRANSCRIPTION" |
 
 ```python
 import torch
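The local-run snippet referenced in the Usage section is cut off by the diff above (only `import tensorflow` is visible), so here is a minimal inference sketch for a wav2vec2 checkpoint with a kenlm-backed processor like this one. It is not the README's exact code: the `SLPL/Sharif-wav2vec2` hub id and the `sample.wav` file name are assumptions.

```python
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2ProcessorWithLM

model_id = "SLPL/Sharif-wav2vec2"  # assumed hub id; replace with the actual repo name
processor = Wav2Vec2ProcessorWithLM.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# Load a 16 kHz mono recording (hypothetical file name).
speech, sample_rate = torchaudio.load("sample.wav")
speech = speech.squeeze().numpy()

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(inputs.input_values).logits

# batch_decode runs the pyctcdecode/kenlm beam search bundled with the processor.
transcription = processor.batch_decode(logits.numpy()).text[0]
print(transcription)
```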
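Likewise, the evaluation code is truncated at `import torch`. The following is a rough sketch of an evaluation loop over a CSV in the `path`/`reference` layout shown in the table above; the `test.csv` file name, the hub id, and the choice of `jiwer` for word error rate are assumptions, not the README's own script.

```python
import pandas as pd
import torch
import torchaudio
from jiwer import wer
from transformers import Wav2Vec2ForCTC, Wav2Vec2ProcessorWithLM

model_id = "SLPL/Sharif-wav2vec2"  # assumed hub id; replace with the actual repo name
processor = Wav2Vec2ProcessorWithLM.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# Hypothetical CSV with the two columns shown in the table above.
df = pd.read_csv("test.csv")

predictions, references = [], []
for _, row in df.iterrows():
    speech, sample_rate = torchaudio.load(row["path"])
    if sample_rate != 16_000:
        # Resample to the 16 kHz rate the model expects.
        speech = torchaudio.functional.resample(speech, sample_rate, 16_000)
    inputs = processor(speech.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predictions.append(processor.batch_decode(logits.numpy()).text[0])
    references.append(row["reference"])

print("WER:", wer(references, predictions))
```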