Upload model.safetensors with huggingface_hub

#3 · opened by jbochi

This PR is similar to https://huggingface.co/grammarly/coedit-large/discussions/4, which was merged into coedit-large.

This new file is equivalent to pytorch_model.bin, but is safe in the sense that no arbitrary code can be embedded in it.
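
For context: .bin checkpoints are pickle archives, and unpickling can execute code embedded in the file, while safetensors stores only raw tensor bytes and metadata. A minimal sketch of the two loading paths, assuming local copies of both files:

import torch
from safetensors.torch import load_file

# torch.load unpickles the file; a malicious .bin could run arbitrary code here.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# load_file only parses tensor bytes and metadata; no code is ever executed.
state_dict = load_file("model.safetensors", device="cpu")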

These files also happen to load much faster than their PyTorch counterparts:
https://colab.research.google.com/github/huggingface/notebooks/blob/main/safetensors_doc/en/speed.ipynb
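
For a rough local check without the notebook (timings vary a lot with hardware and disk cache), something like this can be used:

import time
import torch
from safetensors.torch import load_file

start = time.perf_counter()
torch.load("pytorch_model.bin", map_location="cpu")
print(f"pytorch_model.bin: {time.perf_counter() - start:.1f}s")

start = time.perf_counter()
load_file("model.safetensors", device="cpu")
print(f"model.safetensors: {time.perf_counter() - start:.1f}s")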

The model is too large for https://huggingface.co/spaces/safetensors/convert, so I created the file manually:

from transformers import AutoTokenizer, T5ForConditionalGeneration

# Load the original PyTorch checkpoint and re-save it with safetensors
# serialization; max_shard_size="100GB" keeps the weights in a single file.
model = T5ForConditionalGeneration.from_pretrained("grammarly/coedit-xxl")
model.save_pretrained("coedit-xxl/", safe_serialization=True, max_shard_size="100GB")

To verify that the safetensors file was correct, I ran this code:

# Load the model from the locally converted safetensors checkpoint.
model = T5ForConditionalGeneration.from_pretrained("./coedit-xxl/")
tokenizer = AutoTokenizer.from_pretrained("grammarly/coedit-xxl")

input_text = 'Fix grammatical errors in this sentence: When I grow up, I start to understand what he said is quite right.'
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=256)
edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(edited_text)
# When I grow up, I will start to understand that what he said is quite right
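
As an additional sanity check, the two checkpoints can be compared tensor by tensor. A sketch, assuming a local copy of the original pytorch_model.bin; safetensors deduplicates tied weights (e.g. shared embeddings), so only keys present in both files are compared:

import torch
from safetensors.torch import load_file

sd_bin = torch.load("pytorch_model.bin", map_location="cpu")
sd_st = load_file("coedit-xxl/model.safetensors")

# Tied weights are stored once in safetensors, so compare the common keys only.
common = sd_bin.keys() & sd_st.keys()
for name in common:
    assert torch.equal(sd_bin[name], sd_st[name]), name
print(f"{len(common)} tensors match")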

Finally, this PR was created with this code:

from huggingface_hub import HfApi
api = HfApi()
# Upload the converted file and open a pull request on the repo.
api.upload_file(
    path_or_fileobj="coedit-xxl/model.safetensors",
    path_in_repo="model.safetensors",
    repo_id="grammarly/coedit-xxl",
    create_pr=True,
)
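
Once this PR is merged, users can opt into the new weights explicitly; use_safetensors is a standard from_pretrained argument:

from transformers import T5ForConditionalGeneration

# Load model.safetensors from the Hub instead of pytorch_model.bin.
model = T5ForConditionalGeneration.from_pretrained(
    "grammarly/coedit-xxl", use_safetensors=True)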

Thanks for the extensive documentation and for the fix, @jbochi!

machineteacher changed pull request status to merged
