No support for float16 on CPU?
Tried this model on CPU only with float16, and the code below gave me this error:
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
import torch
from transformers import AutoTokenizer, OPTForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-6.7b")
model = OPTForCausalLM.from_pretrained("facebook/galactica-6.7b", torch_dtype=torch.float16)

input_text = """
# The benefits of deadlifting
## INTRODUCTION
"""

randomizer_value = 0
repetitions = 1

# set the seed to reproduce results; feel free to change it to get different results
torch.manual_seed(randomizer_value)

input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# sample with top_k = 50, top_p = 0.95 and num_return_sequences = 1
sample_outputs = model.generate(
    input_ids,
    do_sample=True,
    max_length=2000,
    top_k=50,
    top_p=0.95,
    num_return_sequences=1,
)
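(For reference, the snippet runs on CPU if the float16 cast is simply dropped, so the weights stay in the default float32; a minimal sketch of that change below. It is slower and heavier on memory, but every CPU kernel, including LayerNorm, supports float32.)

# Same as above, but without torch_dtype the weights stay in float32,
# for which all CPU ops (including LayerNorm) are implemented.
model = OPTForCausalLM.from_pretrained("facebook/galactica-6.7b")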
I think this post answers my question:
https://twitter.com/pytorch/status/1450502321838960641?lang=en
FP16 is only supported in CUDA, BF16 has support on newer CPUs and TPUs
Calling .half() on your network and tensors explicitly casts them to FP16, but not all ops are safe to run in half-precision.
For CPU, BF16 is supported:
From 1.10 onwards, PyTorch has a generic API `torch.autocast()` that automatically casts
* CUDA tensors to FP16, and
* CPU tensors to BF16.
source: https://twitter.com/PyTorch/status/1450502326834368516
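A minimal sketch of what that could look like here, assuming PyTorch >= 1.10: keep the weights in float32 and let `torch.autocast` run the BF16-safe ops (e.g. matmuls) in bfloat16 while the rest (e.g. LayerNorm) falls back to float32.

import torch
from transformers import AutoTokenizer, OPTForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-6.7b")
model = OPTForCausalLM.from_pretrained("facebook/galactica-6.7b")  # float32 weights on CPU

input_ids = tokenizer("# The benefits of deadlifting\n", return_tensors="pt").input_ids

# CPU autocast: eligible ops run in bfloat16, non-BF16-safe ops stay in float32,
# so there is no "not implemented for 'Half'" error.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    sample_outputs = model.generate(input_ids, do_sample=True, max_length=200, top_k=50, top_p=0.95)

print(tokenizer.decode(sample_outputs[0]))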
Now the question is: can we use BF16 instead of FP16?
Again answering myself:
We’re empowering PyTorch 1.12 on the 3rd gen @Intel Xeon® Scalable processor (codename Cooper Lake). It’s the first general purpose x86 CPU with native bfloat16 support, showing a 1.4x to 2.2x performance gain over float32 on the TorchVision models
source: https://twitter.com/pytorch/status/1559611043273375746?lang=en
Conclusion:
Only recent Xeon processors support bfloat16 natively (Cooper Lake was introduced in June 2020).
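So instead of float16, the original snippet can load the model in bfloat16. A sketch under the assumption of a reasonably recent PyTorch build; on pre-Cooper-Lake CPUs it should still run, just without the native BF16 speedup.

import torch
from transformers import AutoTokenizer, OPTForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-6.7b")
# bfloat16 halves the memory footprint vs float32, and recent PyTorch CPU builds
# ship BFloat16 kernels for ops like LayerNorm, unlike Half.
model = OPTForCausalLM.from_pretrained("facebook/galactica-6.7b", torch_dtype=torch.bfloat16)

input_ids = tokenizer("# The benefits of deadlifting\n", return_tensors="pt").input_ids
sample_outputs = model.generate(input_ids, do_sample=True, max_length=200, top_k=50, top_p=0.95)
print(tokenizer.decode(sample_outputs[0]))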