How can I convert a pth model into a GGUF model?
Hi, thanks for your awesome work!
I want to create more comprehensive quantization variants of the original model, but I couldn't find a way to deal with the pth file format. What's worse, the convert_rwkv_checkpoint_to_hf.py script provided by transformers also fails with this error:
```
Traceback (most recent call last):
  File "/home/mzwing/AI/runner/tools/convert_rwkv_checkpoint_to_hf.py", line 201, in <module>
    convert_rmkv_checkpoint_to_hf_format(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        args.repo_id,
        ^^^^^^^^^^^^^
    ...<5 lines>...
        model_name=args.model_name,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/mzwing/AI/runner/tools/convert_rwkv_checkpoint_to_hf.py", line 151, in convert_rmkv_checkpoint_to_hf_format
    torch.save({k: v.cpu().clone() for k, v in state_dict.items()}, os.path.join(output_dir, shard_file))
                                               ^^^^^^^^^^^^^^^^
AttributeError: 'Tensor' object has no attribute 'items'. Did you mean: 'item'?
```
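For reference, here is how I peeked at the checkpoint before converting; the path is a hypothetical stand-in for my model file, and the dict-of-tensors assumption is mine:

```python
# Sketch: inspect what the .pth actually contains. An RWKV .pth is
# typically a flat dict mapping parameter names to tensors.
import torch

ckpt = torch.load("rwkv-final.pth", map_location="cpu")  # hypothetical path
print(type(ckpt))  # expect: <class 'dict'>
for name, tensor in list(ckpt.items())[:5]:
    print(name, tuple(tensor.shape), tensor.dtype)
```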
If I ignore the error and continue converting to gguf, llama.cpp's convert_hf_to_gguf.py throws this:
```
Traceback (most recent call last):
  File "/home/mzwing/AI/repos/llama.cpp/./convert_hf_to_gguf.py", line 5140, in <module>
    main()
    ~~~~^^
  File "/home/mzwing/AI/repos/llama.cpp/./convert_hf_to_gguf.py", line 5112, in main
    model_architecture = hparams["architectures"][0]
                         ~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'architectures'
```
So, how can I convert the pth file into a GGUF model? Could you please help me? Thanks a lot!
Sorry for the late reply 😥. You need to use the pth_to_hf.py script to convert the pth file to HF format, and then convert the HF model to gguf. Below is the content of pth_to_hf.py ❤
```python
# Convert the model to pytorch_model.bin (HF format)
import os
import torch

SOURCE_MODEL = "./v6-FinchX-14B-pth/rwkv-14b-final.pth"
TARGET_MODEL = "./v6-Finch-14B-HF/pytorch_model.bin"

# Delete the target model if it already exists
if os.path.exists(TARGET_MODEL):
    os.remove(TARGET_MODEL)

model = torch.load(SOURCE_MODEL, mmap=True, map_location='cpu')
# Rename all the keys to include "rwkv."
new_model = {}
for key in model.keys():
    # If the key starts with "blocks"
    if key.startswith("blocks."):
        new_key = "rwkv." + key
        # Replace .att. with .attention.
        new_key = new_key.replace(".att.", ".attention.")
        # Replace .ffn. with .feed_forward.
        new_key = new_key.replace(".ffn.", ".feed_forward.")
        # Replace `0.ln0.` with `0.pre_ln.`
        new_key = new_key.replace("0.ln0.", "0.pre_ln.")
    else:
        # No rename needed
        new_key = key
    # Rename `emb.weight` to `rwkv.embeddings.weight`
    if key == "emb.weight":
        new_key = "rwkv.embeddings.weight"
    # Rename `ln_out.x` to `rwkv.ln_out.x`
    if key.startswith("ln_out."):
        new_key = "rwkv." + key
    print("Renaming key:", key, "--to-->", new_key)
    new_model[new_key] = model[key]

# Save the new model
print("Saving the new model to:", TARGET_MODEL)
torch.save(new_model, TARGET_MODEL)
```
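After running it, a quick sanity check of the output can't hurt (a minimal sketch; the path is the TARGET_MODEL above, and the expected key prefixes are my assumption based on the rename rules):

```python
# Sketch: verify every tensor was renamed as expected. After the
# renames above, keys should start with "rwkv." except the LM head.
import torch

state = torch.load("./v6-Finch-14B-HF/pytorch_model.bin", map_location="cpu")
bad = [k for k in state if not k.startswith(("rwkv.", "head."))]
print("unexpected keys:", bad)  # expect an empty list
print(len(state), "tensors total")
```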
Thanks for your reply! However, if I use this script to convert the pth file, llama.cpp's convert_hf_to_gguf.py complains:
```
INFO:hf-to-gguf:Loading model: RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF
Traceback (most recent call last):
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5140, in <module>
    main()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5108, in main
    hparams = Model.load_hparams(dir_model)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 468, in load_hparams
    with open(dir_model / "config.json", "r", encoding="utf-8") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '../RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF/config.json'
```
BTW, does this script come from https://rwkv.cn/llamacpp#appendix-code? I believe it does convert the model into HF format, but it doesn't save the model metadata to config.json, etc.
The required config.json and other files are at this URL: https://huggingface.co/RWKV/rwkv-6-world-3b
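In case it helps anyone else, here is how I'd fetch those sidecar files into the folder next to pytorch_model.bin (a sketch; which files are strictly needed beyond config.json is my guess, and local_dir is my model folder):

```python
# Sketch: download the HF metadata files that pth_to_hf.py doesn't create.
from huggingface_hub import hf_hub_download

for fname in ["config.json", "tokenizer_config.json"]:  # guessed file list
    hf_hub_download(
        repo_id="RWKV/rwkv-6-world-3b",
        filename=fname,
        local_dir="./RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF",
    )
```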
Yes! This Python script comes from https://rwkv.cn/llamacpp#appendix-code
Oh thx a lot! I will have a try later. ❤️
This time, when I run convert_hf_to_gguf.py, I encounter a new traceback 😢:
```
> python ./convert_hf_to_gguf.py --outtype f16 --outfile ../RWKV6-3B-Chn-UnlimitedRP-mini-chat-GGUF.F16.gguf ../RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF/
INFO:hf-to-gguf:Loading model: RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model part 'pytorch_model.bin'
INFO:hf-to-gguf:token_embd.weight, torch.bfloat16 --> F16, shape = {2560, 65536}
............
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5140, in <module>
    main()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5134, in main
    model_instance.write()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 440, in write
    self.prepare_metadata(vocab_only=False)
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 433, in prepare_metadata
    self.set_vocab()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 3330, in set_vocab
    assert (self.dir_model / "rwkv_vocab_v20230424.txt").is_file()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
```
What can I do next?
OK, I think I found it: the rwkv_vocab_v20230424.txt is actually here: https://huggingface.co/RWKV/v6-Finch-1B6-HF/blob/main/rwkv_vocab_v20230424.txt
However, I still cannot convert successfully. It now fails with a new traceback:
```
Traceback (most recent call last):
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5140, in <module>
    main()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 5134, in main
    model_instance.write()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 440, in write
    self.prepare_metadata(vocab_only=False)
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 433, in prepare_metadata
    self.set_vocab()
  File "/content/llama.cpp/./convert_hf_to_gguf.py", line 3340, in set_vocab
    assert len(parts) >= 3
           ^^^^^^^^^^^^^^^
AssertionError
```
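(Later note: the assertion fires on vocab lines that don't split into at least three space-separated parts. My guess is that I saved the /blob/ page, which is HTML, instead of the raw file; fetching it with huggingface_hub avoids that:)

```python
# Sketch: download the raw vocab file into the HF model folder.
# The /blob/... URL serves an HTML page; hf_hub_download (or the
# /resolve/... URL) returns the actual file.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="RWKV/v6-Finch-1B6-HF",
    filename="rwkv_vocab_v20230424.txt",
    local_dir="./RWKV6-3B-Chn-UnlimitedRP-mini-chat-HF",  # my model folder
)
```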
I found a GitHub repo that can help with the conversion: https://github.com/BBuf/RWKV-World-HF-Tokenizer
However, it still requires some manual editing (see its README.md for more details).
I think I will fork it to make some improvements.
BTW, if we want to use that repo, we should probably also pin transformers==4.46.3, or the script will just refuse to work 😢... And the rwkv_vocab_v20230424.txt file I mentioned above still needs to be put in the HF folder.
(Looks a bit complicated...)
You can try using the files in this repo 🤔: https://huggingface.co/RWKV/v6-Finch-3B-HF
Since I don't have LLM-related expertise, I don't quite understand how it works, sorry.
It worked!
Looks like both conversion approaches produce the same result!
Thanks for your patient reply!!!!!
My result: https://huggingface.co/mzwing/RWKV6-3B-Chn-UnlimitedRP-mini-chat-GGUF/tree/main
😊 I'm glad that solved your problem.