meisin123/whisper-small-iban

@@ -21,7 +21,7 @@ It achieves the following results on the evaluation set:
 ## How to Get Started with the Model
-Use the code below to use the model in Inference Mode.
 ```
 from transformers import pipeline
@@ -36,117 +36,22 @@ audio_file = "audio.mp3"   ## use your own audio here
 transcribed_text = pipe(audio_file, batch_size = 16)
 ```
-[More Information Needed]
 ## Training Details
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
 ## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
 ## Model Card Contact
-[More Information Needed]

 ## How to Get Started with the Model
+Use the code below to run the model in **inference mode**.
 ```
 from transformers import pipeline
 transcribed_text = pipe(audio_file, batch_size = 16)
 ```
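The snippet above is shown only partially in this diff: the lines that build `pipe` and set `audio_file` are elided. As a rough sketch of what a complete call might look like, the code below assumes the standard `transformers` `automatic-speech-recognition` pipeline and the model ID `meisin123/whisper-small-iban` (taken from the repository name). The helper names `build_asr_pipeline` and `transcribe` are hypothetical and are not part of the model card.

```python
def build_asr_pipeline(model_id="meisin123/whisper-small-iban"):
    """Create a speech-recognition pipeline (assumed setup, not the author's exact code)."""
    # Imported lazily so transcribe() can be used or tested without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(pipe, audio_file, batch_size=16):
    """Run the pipeline on a single audio file and return the transcribed text."""
    result = pipe(audio_file, batch_size=batch_size)
    # For a single input, ASR pipelines return a dict with a "text" key.
    return result["text"].strip()


# Example usage (downloads the model on first run):
#   pipe = build_asr_pipeline()
#   print(transcribe(pipe, "audio.mp3"))  # use your own audio here
```

Keeping pipeline construction separate from transcription means the model is loaded once and can be reused across many files.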
 ## Training Details
 ### Training Data
+The model is trained on the Iban Speech Corpus, which is available on Hugging Face: [meisin123/iban_speech_corpus](https://huggingface.co/datasets/meisin123/iban_speech_corpus).
+Iban is an under-resourced language. The Iban language (jaku Iban) is spoken by the Iban, one of the Dayak ethnic groups, who live in Brunei, the Indonesian province of West Kalimantan, and the Malaysian state of Sarawak. It belongs to the Malayic subgroup, a Malayo-Polynesian branch of the Austronesian language family.
 ## Evaluation
+### Performance and Limitations
+There is still a lot of room for improvement in this Iban ASR model.
+1. The accuracy of the model can be further improved with more training data. Because Iban is an under-resourced language, there is limited audio data to train on.
+2. Currently, the model cannot handle code-switched speech. If the audio contains a mix of English and Iban, the model performs poorly on the English portions.
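Both limitations above concern transcription accuracy. One common way to quantify accuracy for ASR is word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python version. It assumes WER is an appropriate metric here (the card text in this diff does not say which metric was used), and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Computing this separately on Iban-only and code-switched recordings would be one way to measure how much the English portions degrade the output.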
 ## Model Card Contact
+For more information, please contact the author at meisin123@gmail.com.