Cannot be loaded in whisper.cpp

#1
by MonolithFoundation - opened

whisper_model_load: unknown tensor 'model.encoder.conv1.weight' in model file
whisper_init_with_params_no_state: failed to load model

why??

BELLE-2 Group // Be Everyone's Large Language model Engine org

Please provide the specific running script.

Hi, please help confirm. The script used here is simple:

def test_pywhisper():
    from pywhispercpp.model import Model

    # model = Model('base.en', n_threads=6)
    # model = Model("large-v3-turbo", n_threads=6)

    audio_f = "temp/yyxh3_1206/parts_asr/50.9_51.4.wav"
    print(audio_f)
    model = Model('checkpoints/belle_whisper_v3_turbo_ggml/ggml-model.bin', n_threads=6)
    # segments = model.transcribe('data/lei-jun-test.wav')
    # segments = model.transcribe('temp/yyxh3_1206/extracted_audio_clean.wav')
    # segments = model.transcribe('temp/yyxh3_1206/extracted_audio.wav')
    segments = model.transcribe(audio_f, language="zh")
    for segment in segments:
        print(segment.text)

All the default models load OK; only the BELLE model fails to load.

Any help?

I'm using pywhispercpp because it's the only workable whisper.cpp binding. It should behave exactly the same as whisper.cpp, and since the other default models are inferencable, the library itself should be fine.

Running whisper.cpp directly fails too:

$ ./test_model.sh ~/Downloads/14_14_02.WAV
whisper_init_from_file_with_params_no_state: loading model from '~/Workspace/huggingface/Belle-whisper-large-v3-turbo-zh-ggml/ggml-model.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: backends   = 3
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 128
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3 (small)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: n_langs       = 100
whisper_model_load:    Metal total size =   487.23 MB
whisper_model_load: unknown tensor 'model.encoder.conv1.weight' in model file
whisper_init_with_params_no_state: failed to load model
error: failed to initialize whisper context

$ shasum -a 256 ggml-model.bin
fa40644ba8947b91474c6d7c3d760d95693db745205c1feae890f72de5fa1eae  ggml-model.bin

$ cat test_model.sh
asr_engine=~/Workspace/github/whisper.cpp/main
asr_model=~/Workspace/huggingface/Belle-whisper-large-v3-turbo-zh-ggml/ggml-model.bin
init_prompt="转录中文和English内容,补充标点符号"
$asr_engine -m "$asr_model" -l zh -pc -nt -osrt --prompt "$init_prompt" -f "$1"
BELLE-2 Group // Be Everyone's Large Language model Engine org

I am checking the model

BELLE-2 Group // Be Everyone's Large Language model Engine org

The issue is resolved; please download the model again.

Great, it works now.

$ ./test_model.sh ~/Downloads/14_14_02.WAV
whisper_init_from_file_with_params_no_state: loading model from '~/Workspace/huggingface/Belle-whisper-large-v3-turbo-zh-ggml/ggml-model.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: backends   = 3
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 4
whisper_model_load: n_mels        = 128
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 5 (large v3)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: n_langs       = 100
whisper_model_load:    Metal total size =  1623.92 MB
whisper_model_load: model size    = 1623.92 MB
whisper_backend_init_gpu: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M3 Max
...

$ shasum -a 256 Belle-whisper-large-v3-turbo-zh-ggml/ggml-model.bin
2a3bba5bfdb4d4da3d9949a83b405711727ca1941d4d5810895e077eb3cb4d99  Belle-whisper-large-v3-turbo-zh-ggml/ggml-model.bin

Hi, were the GGML weights too old for the newest whisper.cpp to load?

BELLE-2 Group // Be Everyone's Large Language model Engine org

The issue arose due to a mismatch in the conversion scripts. I have addressed this bug, and the corrected scripts are now available at this GitHub repository. You can use these updated scripts for converting models without encountering the previous problems.
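For anyone curious what the mismatch looks like in practice: the failing tensor name in the log, 'model.encoder.conv1.weight', is the Hugging Face naming convention, while whisper.cpp's loader looks tensors up by OpenAI-style names without the 'model.' prefix (e.g. 'encoder.conv1.weight'). A conversion script has to rename tensors before writing the ggml file; if it doesn't, loading fails exactly as in the logs above. The sketch below illustrates only the first step of that renaming, as an assumption about the cause; the real converter also remaps attention/MLP submodule names.

```python
# Hypothetical sketch of the tensor-name mismatch behind the
# "unknown tensor" error. Hugging Face Whisper checkpoints prefix
# every tensor with 'model.'; whisper.cpp expects the bare name.

def to_whisper_cpp_name(hf_name: str) -> str:
    """Strip the Hugging Face 'model.' prefix (first step of the mapping;
    a full converter also renames attention/MLP submodules)."""
    prefix = "model."
    return hf_name[len(prefix):] if hf_name.startswith(prefix) else hf_name

print(to_whisper_cpp_name("model.encoder.conv1.weight"))  # encoder.conv1.weight
```

A converter that skips this renaming produces a file whose tensors whisper.cpp cannot find, which is consistent with the fixed scripts resolving the problem.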
