PyTorch int8 quantized version of gpt2-large

Usage

Download the .bin file locally and load it with:

import torch

# weights_only=False is needed on PyTorch >= 2.6, where torch.load defaults
# to weights_only=True; this file pickles the whole module, not a state dict
model = torch.load("path/to/pytorch_model_quantized.bin", weights_only=False)

Everything else follows the original gpt2-large usage instructions.
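For context, a checkpoint like this is typically produced with PyTorch's dynamic int8 quantization and saved as a whole module. The sketch below demonstrates that workflow on a toy model (an assumption about how this particular .bin was made, not a statement from the model author); the round trip through torch.save/torch.load mirrors the loading step above.

```python
import torch
import torch.nn as nn

# Toy stand-in for a large float32 model (assumption: the published
# checkpoint was created with this same dynamic-quantization recipe)
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))

# Replace Linear layers with dynamically quantized int8 versions
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Save the whole module so torch.load() restores it directly
torch.save(qmodel, "quantized_toy.bin")
restored = torch.load("quantized_toy.bin", weights_only=False)

# The quantized model is used exactly like the original float model
out = restored(torch.randn(1, 16))
print(out.shape)  # torch.Size([1, 4])
```

Dynamic quantization stores int8 weights and quantizes activations on the fly, which is why the resulting .bin is roughly 4x smaller than the float32 original.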