Convert checkpoint files to float16
#6
by mkardas - opened
No description provided.
mkardas changed pull request status to open
How can I implement this?
What are you trying to achieve?
The 1.3b model uses most of my 8 GB of VRAM, so larger requests push it over pretty quickly. I was hoping this would cut the memory use down.
You can load your model in float16 with:

import torch
from transformers import OPTForCausalLM

model = OPTForCausalLM.from_pretrained(
    "facebook/galactica-1.3b",
    torch_dtype=torch.float16,  # load weights in half precision
    device_map="auto",          # requires the accelerate package
)
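
Once loaded this way, the weights take roughly half the memory of the float32 checkpoint and generation works as usual. A quick usage sketch (the prompt is just an illustration):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")
inputs = tokenizer("The Transformer architecture", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0]))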
mkardas changed pull request status to merged
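
For reference, the conversion this PR applies to the checkpoint files can be reproduced along these lines. This is a minimal sketch, not the script actually used in the PR, and the output directory name is illustrative:

from transformers import OPTForCausalLM

# Load the original float32 checkpoint, cast every parameter to float16, and re-save
model = OPTForCausalLM.from_pretrained("facebook/galactica-1.3b")
model = model.half()
model.save_pretrained("galactica-1.3b-fp16")  # illustrative output path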