|
# Pytorch to Safetensor Converter |
|
|
|
--- |
|
|
|
|
|
|
|
A simple converter which converts pytorch .bin tensor files (Usually listed as "pytorch_model.bin" or "pytorch_model-xxxx-of-xxxx.bin") to safetensor files. Reason? |
|
|
|
~~because it's cool!~~ |
|
|
|
Because the safetensor format decreases the loading time of large LLM models, currently supported in [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui). It also supports in-place loading, which effectively decreased the required memory to load a LLM. |
|
|
|
Note: Most of the code originated from [Convert to Safetensors - a Hugging Face Space by safetensors](https://huggingface.co/spaces/safetensors/convert), and this code cannot deal with files that are not named as "pytorch_model.bin" or "pytorch_model-xxxx-of-xxxx.bin". |
|
|
|
### Limitations: |
|
|
|
The program requires **A lot** of memory. To be specific, your idle memory should be **at least** twice the size of your largest ".bin" file. Or else, the program will run out of memory and use your swap... that would be **slow!** |
|
|
|
This program **will not** re-shard (aka break down) the model, you'll need to do it yourself using some other tools. |
|
|
|
### Usage: |
|
|
|
After installing python (Python 3.10.x is suggested), ``cd`` into the repository and install dependencies first: |
|
|
|
``` |
|
git clone https://github.com/Silver267/pytorch-to-safetensor-converter.git |
|
cd pytorch-to-safetensor-converter |
|
pip install -r requirements.txt |
|
``` |
|
|
|
Copy **all content** of your model's folder into this repository, then run: |
|
|
|
``` |
|
python convert_to_safetensor.py |
|
``` |
|
Follow the instruction in the program. Remember to use the **full path** for the model directory (Something like ``E:\models\xxx-fp16`` that contains all the model files). Wait for a while, and you're good to go. The program will automatically copy all other files to your destination folder, enjoy! |
|
|
|
### Precision stuff |
|
if your original model is fp32 then don't forget to edit ``"torch_dtype": "float32",`` to ``"torch_dtype": "float16",`` in ``config.json`` |
|
#### Note that this operation might (in rare occasions) cause the LLM to output NaN while performing operations since it decreases the precision to fp16. |
|
If you're worried about that, simply edit the line ``loaded = {k: v.contiguous().half() for k, v in loaded.items()}`` in ``convert_to_safetensor.py`` into ``loaded = {k: v.contiguous() for k, v in loaded.items()}`` and you'll have a full precision model. |
|
|