|
--- |
|
tags: |
|
- text-generation-inference |
|
- whisper |
|
- audio |
|
base_model: |
|
- openai/whisper-large-v3 |
|
--- |
|
|
|
|
|
# Whisper Large v3 with Key-Value-Cache enabled in ONNX fp16 format |
|
- Model creator: [OpenAI](https://huggingface.co/openai)
|
- Original model: [Whisper Large v3](https://huggingface.co/openai/whisper-large-v3) |
|
|
|
<!-- description start --> |
|
## Description |
|
|
|
This repo contains the ONNX files for the conversion of Whisper Large v3 done by Esperanto Technologies.

The model is in fp16 format and has the key-value cache (KVC) enabled.
|
|
|
<!-- description end --> |
|
|
|
## How to download ONNX model and weight files |
|
|
|
The easiest way to obtain the model is to clone this whole repo. |
|
Alternatively, you can download the files using the `huggingface-hub` Python library.
|
|
|
```shell |
|
pip3 install "huggingface-hub>=0.17.1"
|
``` |
|
|
|
Then you can download any individual model file to the current directory, at high speed, with a command like this: |
|
|
|
```shell |
|
huggingface-cli download Esperanto/whisper-large-v3-kvc-fp16-onnx --local-dir whisper-large-v3-kvc-fp16-onnx --local-dir-use-symlinks False |
|
``` |
|
|
|
For more documentation on downloading with `huggingface-cli`, please see: [HF -> Hub Python Library -> Download files -> Download from the CLI](https://huggingface.co/docs/huggingface_hub/guides/download#download-from-the-cli). |
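The same download can also be done from Python with the `huggingface-hub` library's `snapshot_download` function. A minimal sketch (the `local_dir` value is just a suggested destination):

```python
from huggingface_hub import snapshot_download

def download_model(local_dir: str = "whisper-large-v3-kvc-fp16-onnx") -> str:
    # Download every file in the repo to local_dir and return the local path.
    return snapshot_download(
        repo_id="Esperanto/whisper-large-v3-kvc-fp16-onnx",
        local_dir=local_dir,
    )

# Example:
#   model_path = download_model()
```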
|
|
|
## How to run from Python code using ONNXRuntime |
|
|
|
This model can easily be run on a CPU using [ONNXRuntime](https://onnxruntime.ai/).
|
|
|
Scripts showing how to run these models will be provided soon.