Convert checkpoint files to float16

#6
by mkardas - opened
No description provided.
mkardas changed pull request status to open

How can I implement this?

What are you trying to achieve?

The 1.3b model uses most of my 8 GB of VRAM, so large requests push it over pretty quickly. I was hoping this would cut the memory use down.

You can load your model with:

import torch
from transformers import OPTForCausalLM

# Load the weights directly in float16 and let Accelerate place the
# model on the available GPU(s) (requires `accelerate` to be installed).
model = OPTForCausalLM.from_pretrained(
    "facebook/galactica-1.3b",
    torch_dtype=torch.float16,
    device_map="auto",
)
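
In float16 the weights take about 2 bytes per parameter, so roughly 2.6 GB for the 1.3b model (activations for long requests come on top of that). If you want to check, here's a rough sketch that prints the footprint and runs a generation; the prompt is just an illustrative example in the style of the model card:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")

# Size of the loaded weights in bytes (~2.6 GB in float16).
print(model.get_memory_footprint())

# Example prompt; longer prompts and outputs use more VRAM.
inputs = tokenizer("The Transformer architecture [START_REF]", return_tensors="pt").to(model.device)
outputs = model.generate(inputs.input_ids, max_new_tokens=60)
print(tokenizer.decode(outputs[0]))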
mkardas changed pull request status to merged
