kotoba-tech/kotoba-whisper-v2.0

asahi417 committed on Sep 18

Commit 6dd9441 • 1 Parent(s): 7e01ace

Update README.md

Files changed (1)

README.md +6 -0

@@ -107,6 +107,7 @@ it inherits the benefit of the improved latency compared to [openai/whisper-larg
 | Model                                                                                        | Params / M | Rel. Latency |
 |----------------------------------------------------------------------------------------------|------------|--------------|
 | **[kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0)**| **756**    | **6.3**      |
+| **[kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0)**| **756**    | **6.3**      |
 | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)                    | 1550       | 1.0          |
@@ -244,6 +245,11 @@ Then pass `attn_implementation="flash_attention_2"` to `from_pretrained`:
 See [https://huggingface.co/distil-whisper/distil-large-v3#model-details](https://huggingface.co/distil-whisper/distil-large-v3#model-details).
+## Training
+Please refer to [https://github.com/kotoba-tech/kotoba-whisper](https://github.com/kotoba-tech/kotoba-whisper) for the model training detail.
+Datasets used in distillation and the whole model variations can be found at [https://huggingface.co/japanese-asr](https://huggingface.co/japanese-asr).
 ## Evaluation
 The following code-snippets demonstrates how to evaluate the kotoba-whisper model on the Japanese subset of the CommonVoice 8.0.
 First, we need to install the required packages, including 🤗 Datasets to load the audio data, and 🤗 Evaluate to