Converting to GGUF
#1 opened by gardner
Thank you for creating this model. I am keen to try it out. I attempted to convert it to GGUF for use with llama.cpp, but it appears that the tokenizer has been customized, or that the reference to the tokenizer is lost somewhere during the conversion process. Could you please confirm:
- Which tokenizer should be used?
- Was it customized at all? (A quick comparison sketch against the base tokenizer is below.)
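One rough way to check the second point is to load both tokenizers with transformers and compare them. This is only a sketch; it assumes both repos are accessible and that the fine-tune is based on openbmb/MiniCPM-Llama3-V-2_5, as the conversion commands below imply, and the sample sentence is arbitrary:

```python
# Rough comparison of the fine-tuned tokenizer against the base model's tokenizer.
# Differing vocab sizes, added tokens, or encodings would suggest customization.
from transformers import AutoTokenizer

base = AutoTokenizer.from_pretrained("openbmb/MiniCPM-Llama3-V-2_5", trust_remote_code=True)
ft = AutoTokenizer.from_pretrained(
    "ContactDoctor/Bio-Medical-MultiModal-Llama-3-8B-V1", trust_remote_code=True
)

sample = "Patient presents with acute dyspnea and a history of COPD."
print("vocab sizes:", len(base), len(ft))
print("added tokens:", len(base.get_added_vocab()), len(ft.get_added_vocab()))
print("same encoding:", base.encode(sample) == ft.encode(sample))
```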
To convert this model to GGUF for use with llama.cpp, I ran the following commands:
# Download the model
huggingface-cli download ContactDoctor/Bio-Medical-MultiModal-Llama-3-8B-V1 --local-dir $PWD/models/Bio-Medical-MultiModal-Llama-3-8B-V1 --local-dir-use-symlinks False
# Cache the base model
huggingface-cli download openbmb/MiniCPM-Llama3-V-2_5
# Run conversion
python3 ./examples/llava/minicpmv-surgery.py -m $PWD/models/Bio-Medical-MultiModal-Llama-3-8B-V1
python3 ./examples/llava/minicpmv-convert-image-encoder-to-gguf.py -m $PWD/models/Bio-Medical-MultiModal-Llama-3-8B-V1 --minicpmv-projector $PWD/models/Bio-Medical-MultiModal-Llama-3-8B-V1/minicpmv.projector --output-dir $PWD/models/Bio-Medical-MultiModal-Llama-3-8B-V1/ --image-mean 0.5 0.5 0.5 --image-std 0.5 0.5 0.5
python3 ./convert_hf_to_gguf.py ./models/Bio-Medical-MultiModal-Llama-3-8B-V1/model
which produces the error:
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 8192
INFO:hf-to-gguf:gguf: embedding length = 4096
INFO:hf-to-gguf:gguf: feed forward length = 14336
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
The repository for /home/user/src/llama.cpp/models/Bio-Medical-MultiModal-Llama-3-8B-V1/model contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//home/user/src/llama.cpp/models/Bio-Medical-MultiModal-Llama-3-8B-V1/model.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.
Do you wish to run the custom code? [y/N] y
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
WARNING:hf-to-gguf:
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:** There are 2 possible reasons for this:
WARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref: https://github.com/ggerganov/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh: 1baddeb572cd9de2a6d36f2ad0c361490bf5447dafca20afbac625e9d37f18a5
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:
Traceback (most recent call last):
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 1469, in set_vocab
self._set_vocab_sentencepiece()
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 692, in _set_vocab_sentencepiece
tokens, scores, toktypes = self._create_vocab_sentencepiece()
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 709, in _create_vocab_sentencepiece
raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: /home/user/src/llama.cpp/models/Bio-Medical-MultiModal-Llama-3-8B-V1/model/tokenizer.model
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 1472, in set_vocab
self._set_vocab_llama_hf()
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 784, in _set_vocab_llama_hf
vocab = gguf.LlamaHfVocab(self.dir_model)
File "/home/user/src/llama.cpp/gguf-py/gguf/vocab.py", line 368, in __init__
raise FileNotFoundError('Cannot find Llama BPE tokenizer')
FileNotFoundError: Cannot find Llama BPE tokenizer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 4067, in <module>
main()
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 4061, in main
model_instance.write()
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 391, in write
self.prepare_metadata(vocab_only=False)
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 384, in prepare_metadata
self.set_vocab()
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 1475, in set_vocab
self._set_vocab_gpt2()
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 628, in _set_vocab_gpt2
tokens, toktypes, tokpre = self.get_vocab_base()
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 472, in get_vocab_base
tokpre = self.get_vocab_base_pre(tokenizer)
File "/home/user/src/llama.cpp/./convert_hf_to_gguf.py", line 619, in get_vocab_base_pre
raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
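For what it's worth, my reading of the traceback: convert_hf_to_gguf.py tries the SentencePiece vocab first (tokenizer.model is missing, which is expected for a Llama 3 based BPE tokenizer), then the Llama HF vocab, and finally falls back to the GPT-2 style BPE path. That path identifies the pre-tokenizer by hashing the tokenizer's encoding of a fixed check string and looking the hash up in get_vocab_base_pre(); the chkhsh printed above is not in that table, hence the NotImplementedError. Below is a minimal sketch of that hash check, assuming it mirrors convert_hf_to_gguf.py (the real chktxt constant has to be copied from that script for the hash to be comparable):

```python
# Sketch only: recompute the chkhsh that convert_hf_to_gguf.py reports, assuming it
# hashes str(tokenizer.encode(chktxt)). The chktxt below is a placeholder; copy the
# actual constant from llama.cpp's convert_hf_to_gguf.py for a comparable hash.
import hashlib
from transformers import AutoTokenizer

chktxt = "..."  # placeholder for the chktxt constant defined in convert_hf_to_gguf.py

tok = AutoTokenizer.from_pretrained(
    "./models/Bio-Medical-MultiModal-Llama-3-8B-V1/model", trust_remote_code=True
)
chkhsh = hashlib.sha256(str(tok.encode(chktxt)).encode()).hexdigest()
print(chkhsh)  # compare against the hashes handled in get_vocab_base_pre()
```

If the hash computed the same way for the base openbmb tokenizer is one of the recognized ones while this model's is not, that would point to the tokenizer (or at least its added special tokens) having been changed in the fine-tune, which is also what the "Special tokens have been added in the vocabulary" message above hints at.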
SrikanthChellappa changed discussion status to closed
Thanks @gardner. Let me see if we can find time to get a GGUF created and hosted on HF in the next few days.