metadata
language: en
license: apache-2.0
datasets:
  - sst2
  - glue
metrics:
  - accuracy
tags:
  - text-classification
  - neural-compressor
  - int8

Dynamically quantized and pruned DistilBERT base uncased fine-tuned on SST-2

Model Details

Model Description: This model is a DistilBERT model fine-tuned on SST-2. It was dynamically quantized and pruned with a magnitude pruning strategy to reach a sparsity of 10%, using optimum-intel with Intel® Neural Compressor.

  • Model Type: Text Classification
  • Language(s): English
  • License: Apache-2.0
  • Parent Model: For more details on the original model, we encourage users to check out this model card.
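
The two compression steps named in the model description can be illustrated with a small, dependency-free sketch. It is not how Intel® Neural Compressor implements them; the helper names (`magnitude_prune`, `quantize_int8`, `dynamic_int8_dot`) and the sample weights are hypothetical, and symmetric per-tensor int8 scaling is an assumption made for simplicity. Magnitude pruning zeroes the weights with the smallest absolute values until the target sparsity is reached, and "dynamic" quantization means the activation scale is computed at inference time rather than calibrated in advance.

```python
def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude entries so that about `sparsity` of them are zero."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


def quantize_int8(values):
    """Symmetric per-tensor int8 quantization (an assumption). Returns (ints, scale)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dynamic_int8_dot(q_weights, w_scale, activations):
    """Integer dot product; the activation scale is computed at call time."""
    q_act, a_scale = quantize_int8(activations)
    acc = sum(qw * qa for qw, qa in zip(q_weights, q_act))
    # Rescale the integer accumulator back to floating point.
    return acc * w_scale * a_scale


# Hypothetical weights, chosen only to make the example concrete.
weights = [0.8, -0.05, 0.3, -1.2, 0.01, 0.6, -0.4, 0.9, -0.7, 0.2]
pruned = magnitude_prune(weights, sparsity=0.10)  # 10% sparsity, as in this model
q_weights, w_scale = quantize_int8(pruned)        # weights quantized ahead of time

activations = [1.0] * len(weights)
approx = dynamic_int8_dot(q_weights, w_scale, activations)
exact = sum(w * a for w, a in zip(pruned, activations))
print(f"int8 result: {approx:.4f}, float result: {exact:.4f}")
```

The two printed values should be close but not identical; the gap is the rounding error introduced by the int8 representation.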

How to Get Started With the Model

To load the quantized model and run inference with a Transformers pipeline, you can do as follows:

from transformers import AutoTokenizer, pipeline
from optimum.intel.neural_compressor import IncQuantizedModelForSequenceClassification

# Load the quantized, pruned model and its tokenizer from the Hub
model_id = "echarlaix/distilbert-sst2-inc-dynamic-quantization-magnitude-pruning-0.1"
model = IncQuantizedModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a text-classification pipeline and classify a single sentence
cls_pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
text = "He's a dreadful magician."
outputs = cls_pipe(text)
print(outputs)
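
A text-classification pipeline called on a single string returns a list of dictionaries, each with a `label` and a `score` key. The sketch below shows one way to read that result. The helper `top_prediction` is hypothetical, and the sample values (including the label name) are placeholders for illustration, not actual output from this model.

```python
def top_prediction(outputs):
    """Return (label, score) for the highest-scoring entry of a pipeline result."""
    best = max(outputs, key=lambda item: item["score"])
    return best["label"], best["score"]


# Placeholder values for illustration only, not real output from this model.
example_outputs = [{"label": "NEGATIVE", "score": 0.9}]

label, score = top_prediction(example_outputs)
print(f"{label} ({score:.2f})")
```

In practice, `example_outputs` would be replaced by the `outputs` value returned by the pipeline call.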