Converting from mistral-finetune to HF-compatible weights
#79 · opened by azimjon
I couldn't find a script to convert the output of the mistral-finetune library into weights compatible with the transformers library. I would appreciate it if someone could share one.
Here is an example of calling it from Python:
# Convert to HF format
import subprocess
import sys
from pathlib import Path

convert_command = ['python3', 'transformers/src/transformers/models/mistral/convert_mistral_weights_to_hf.py',
                   '--input_dir', model_save_path, '--model_size', '13B', '--is_v3',
                   '--output_dir', f'{model_save_path}-hf']
with open(f"{Path.home()}/convert.log", "wb") as f:
    hf_convert_result = subprocess.Popen(convert_command, stdout=subprocess.PIPE,
                                         cwd=f'{Path.home()}/')
    # Stream the converter's output to the console and to the log file.
    for c in iter(lambda: hf_convert_result.stdout.read(1), b""):
        sys.stdout.buffer.write(c)
        f.write(c)
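If streaming the output to the console isn't needed, a simpler variant lets subprocess.run do the waiting and error checking. This is just a sketch; model_save_path is a placeholder for the mistral-finetune output directory.

import subprocess
from pathlib import Path

model_save_path = "/path/to/mistral-finetune/output"  # placeholder path
convert_command = ['python3', 'transformers/src/transformers/models/mistral/convert_mistral_weights_to_hf.py',
                   '--input_dir', model_save_path, '--model_size', '13B', '--is_v3',
                   '--output_dir', f'{model_save_path}-hf']
with open(f"{Path.home()}/convert.log", "wb") as f:
    # subprocess.run blocks until the converter finishes and, with check=True,
    # raises CalledProcessError if it exits with a non-zero status.
    subprocess.run(convert_command, stdout=f, stderr=subprocess.STDOUT,
                   cwd=f'{Path.home()}/', check=True)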
Gentle ping on this. Converting a Nemo model fails with the following traceback:
Traceback (most recent call last):
File "/home/ec2-user/workspace/transformers/src/transformers/models/mistral/convert_mistral_weights_to_hf.py", line 276, in <module>
main()
File "/home/ec2-user/workspace/transformers/src/transformers/models/mistral/convert_mistral_weights_to_hf.py", line 271, in main
convert_and_write_model(args.input_dir, args.output_dir, args.max_position_embeddings, args.modules_are_split)
File "/home/ec2-user/workspace/transformers/src/transformers/models/mistral/convert_mistral_weights_to_hf.py", line 217, in convert_and_write_model
new_dict = convert_state_dict(original_state_dict, config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ec2-user/workspace/transformers/src/transformers/models/mistral/convert_mistral_weights_to_hf.py", line 103, in convert_state_dict
tensor = tensor.view(num_key_value_heads, dims_per_head, dim).reshape(key_value_dim, dim)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: shape '[8, 160, 5120]' is invalid for input of size 5242880
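The numbers in the error point to a head-dimension mismatch. This is an inference, assuming the checkpoint follows Mistral-Nemo's published configuration (hidden_size 5120, 32 attention heads, 8 KV heads, and a head_dim of 128 that is decoupled from hidden_size // num_heads), while the script derives dims_per_head as hidden_size // num_heads. A quick sketch of the arithmetic:

# Reproduce the size mismatch from the traceback (values assumed from
# Mistral-Nemo's config: hidden_size 5120, 32 heads, 8 KV heads, head_dim 128).
import torch

dim = 5120                          # hidden size
num_heads = 32
num_key_value_heads = 8
dims_per_head = dim // num_heads    # 160 -- what the conversion script derives
actual_head_dim = 128               # what the Nemo checkpoint actually stores

wk = torch.empty(num_key_value_heads * actual_head_dim, dim)
print(wk.numel())                                   # 5242880, as in the error
print(num_key_value_heads * dims_per_head * dim)    # 6553600, what the view expects
# wk.view(num_key_value_heads, dims_per_head, dim)  # -> the same RuntimeError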
Any guidance on how to fix this?