tomg-group-umd/Gemstone-256x23
A 50M-parameter model should be at most 50 megabytes or so, not 200 or 100 or whatever.
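For what it's worth, the arithmetic behind those numbers (a rough sketch; real checkpoints carry a bit of extra metadata on top):

```python
# Approximate on-disk size of a 50M-parameter dense model at common precisions:
# 4 bytes/param (fp32) ~ 200 MB, 2 bytes (fp16/bf16) ~ 100 MB, 1 byte (8-bit) ~ 50 MB.
params = 50_000_000
for precision, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("8-bit", 1)]:
    print(f"{precision}: ~{params * bytes_per_param / 1e6:.0f} MB")
```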
If this is a roundabout way of asking for quants, I am sorry - the Gemstone models all seem to lack the tokenizer.model file, which llama.cpp requires to convert to GGUF :(
If you can get the creators to add that file, I am willing to try to quantize all of them, of course.
Cheers!
They're based on the Gemma 2B architecture. Try using a gemma-2b (not 2b-it) tokenizer.
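Something like this is how I'd graft it onto the Gemstone checkpoint (just a sketch; it assumes google/gemma-2b's sentencepiece vocab really matches what the Gemstones were trained with, and that you've accepted the gated-repo license and are logged in):

```python
from huggingface_hub import hf_hub_download
import shutil

# Fetch tokenizer.model from the base Gemma repo (gated: needs an accepted
# license and `huggingface-cli login`).
src = hf_hub_download(repo_id="google/gemma-2b", filename="tokenizer.model")

# Drop it next to the locally cloned Gemstone weights so convert_hf_to_gguf.py
# can find it (adjust the destination to wherever you cloned the repo).
shutil.copy(src, "./Gemstone-256x23/tokenizer.model")
```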
According to README.md: "Using Gemstone-256x23
The Gemstones are based on the gemma-2b architecture and use modeling_gemma.py to run using the transformers library."
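For completeness, loading it through transformers looks roughly like this (my guess at the incantation, not the README's verbatim snippet; the trust_remote_code flag is only needed if the repo actually ships modeling_gemma.py as remote code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "tomg-group-umd/Gemstone-256x23"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("The Gemstones are", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```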
There is a tokenizer.json, though!
I just tried to convert the model to a source GGUF with the latest llama.cpp, using python convert_hf_to_gguf.py /root/Gemstone-256x23 --outfile /root/Gemstone-256x23.gguf,
and I'm getting the following error:
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "/root/llama.cpp/convert_hf_to_gguf.py", line 5112, in <module>
    main()
  File "/root/llama.cpp/convert_hf_to_gguf.py", line 5106, in main
    model_instance.write()
  File "/root/llama.cpp/convert_hf_to_gguf.py", line 440, in write
    self.prepare_metadata(vocab_only=False)
  File "/root/llama.cpp/convert_hf_to_gguf.py", line 433, in prepare_metadata
    self.set_vocab()
  File "/root/llama.cpp/convert_hf_to_gguf.py", line 3227, in set_vocab
    self._set_vocab_sentencepiece()
  File "/root/llama.cpp/convert_hf_to_gguf.py", line 792, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/llama.cpp/convert_hf_to_gguf.py", line 809, in _create_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: /root/Gemstone-256x23/tokenizer.model
I haven't looked very deeply into this, but for many models (though not all) the tokenizer.json alone doesn't seem to be enough. There might be a way to convert one into the other (I think it's a slow-tokenizer vs. fast-tokenizer issue), but right now llama.cpp insists on tokenizer.model for some model architectures.
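To illustrate the distinction (untested against this exact repo, so take it as a sketch): the fast tokenizer loads happily from tokenizer.json alone, while the slow sentencepiece path, the one convert_hf_to_gguf.py's _set_vocab_sentencepiece() corresponds to, wants tokenizer.model:

```python
from transformers import AutoTokenizer

repo = "tomg-group-umd/Gemstone-256x23"

# Fast (Rust-backed) tokenizer: tokenizer.json is enough.
fast_tok = AutoTokenizer.from_pretrained(repo)

# Slow sentencepiece tokenizer: needs tokenizer.model, so this should fail on
# the Gemstone repos for the same reason the converter does.
try:
    slow_tok = AutoTokenizer.from_pretrained(repo, use_fast=False)
except Exception as e:
    print(f"slow tokenizer load failed: {e}")
```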
They're also based on the Gemma 2B architecture. Try using a gemma-2b (not 2b-it) tokenizer!
If you (or somebody else) wants to clone it and provide the tokenizer.model, I'll be happy to quantize it.
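For reference, this is the workflow I'd run once a tokenizer.model is in place (same paths as my attempt above; the quant types and the exact quantize binary name may differ depending on your llama.cpp build):

```bash
# Convert the HF checkpoint to a full-precision source GGUF.
python convert_hf_to_gguf.py /root/Gemstone-256x23 \
    --outfile /root/Gemstone-256x23-f16.gguf --outtype f16

# Quantize it (the tool is llama-quantize in current builds, plain quantize in older ones).
./llama-quantize /root/Gemstone-256x23-f16.gguf /root/Gemstone-256x23-Q8_0.gguf Q8_0
```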