config.json, tokenizer.json etc. files for onnx?

#1
by YuryKonvpalto - opened

Hi guys, where are the config.json, tokenizer.json, etc. files for the ONNX model?

Sorry, I cannot understand you.

What are those files?

We don't have such files in Next-gen Kaldi.

Everything is open-sourced in Next-gen Kaldi. We would like to share everything we can share.

I mean, when I try to load Vosk through ONNX and transformers.js like this:

import { pipeline, env } from "https://cdn.jsdelivr.net/npm/@xenova/transformers@2.14.0";
const model = await pipeline('automatic-speech-recognition', 'alphacep/vosk-model-small-ru');

the browser console shows these errors:
https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/preprocessor_config.json 404 (Not Found)
https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/config.json 404 (Not Found)
https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/tokenizer_config.json 404 (Not Found)
https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/tokenizer.json 404 (Not Found)

For example, Whisper ('Xenova/whisper-small') works fine with the same code. And if you check their HF repo under "Files and versions", all of these files (config, tokenizer, etc.) are there.
It seems that to work with the ONNX/transformers.js pipeline, a repo has to contain these files.

Maybe I'm missing something here? I'm not very familiar with that pipeline, but by analogy, Whisper works fine with all these files in place...
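
To make the check concrete, here is a minimal sketch that probes a repo for the four files named in the 404 log above. The URL pattern is copied from that log; the helper names (`REQUIRED_FILES`, `fileUrl`, `missingFiles`) are hypothetical, and the file list is only what this particular error output shows, not an authoritative statement of what transformers.js needs for every pipeline. It also assumes the Hub answers HEAD requests on these URLs the same way it answers the GET requests seen in the log.

```javascript
// Files the 404 log shows being requested from the repo (taken from the log, not from library docs).
const REQUIRED_FILES = [
  "preprocessor_config.json",
  "config.json",
  "tokenizer_config.json",
  "tokenizer.json",
];

// Build a file URL following the pattern seen in the 404 log.
function fileUrl(repoId, fileName, revision = "main") {
  return `https://huggingface.co/${repoId}/resolve/${revision}/${fileName}`;
}

// Resolve to the names of the files the repo does not serve.
// `fetchFn` is injectable so the function can be tried without network access.
async function missingFiles(repoId, fetchFn = fetch) {
  const checks = await Promise.all(
    REQUIRED_FILES.map(async (name) => {
      const response = await fetchFn(fileUrl(repoId, name), { method: "HEAD" });
      return response.ok ? null : name;
    })
  );
  return checks.filter((name) => name !== null);
}
```

Calling `missingFiles("alphacep/vosk-model-small-ru").then(console.log)` in a browser console or a recent Node.js version would print whichever of those files the repo does not serve; an empty array would suggest the repo has the same layout as the Whisper one.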

> I mean, when I try to load Vosk through ONNX and transformers.js like this:

The model is not designed to be used with transformers.

Please use sherpa-onnx instead:
http://github.com/k2-fsa/sherpa-onnx

The model can only be used with sherpa-onnx, not elsewhere.

You can find an example of its usage here:
https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/blob/main/model.py#L434


Could you tell us where you found instructions saying that you can use the model with transformers?

Ah, got it. Thanks a lot, guys, for the explanation and the links!
It would be great anyway if one day Vosk could be used via transformers.js too! :)

I am afraid there is no plan to support it.

Our target is C++ deployment.

csukuangfj changed discussion status to closed
