Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static-inc

Text Classification

text-classfication

neural-compressor

Intel® Neural Compressor

PostTrainingStatic

xinhe committed on Apr 11, 2022

Commit a96c305 • 1 Parent(s): d0e482a

Update README.md

Files changed (1)

README.md +3 -2

@@ -17,7 +17,7 @@ metrics:
 This is an INT8  PyTorch model quantified with [intel/nlp-toolkit](https://github.com/intel/nlp-toolkit) using provider: [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
-The original fp32 model comes from the fine-tuned model [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english)
+The original fp32 model comes from the fine-tuned model [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
 The calibration dataloader is the train dataloader. The default calibration sampling size 100 isn't divisible exactly by batch size 8, so
  the real sampling size is 104.
@@ -33,13 +33,14 @@ The calibration dataloader is the train dataloader. The default calibration samp
 | **Accuracy (eval-accuracy)** |0.9037|0.9106|
 | **Model size (MB)**  |65|255|
 ### Load with nlp-toolkit:
 ```python
 from nlp_toolkit import OptimizedModel
 int8_model = OptimizedModel.from_pretrained(
     'Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static',
 )
 ```
 Notes:
  - The INT8 model has better performance than the FP32 model when the CPU is fully occupied. Otherwise, there will be the illusion that INT8 is inferior to FP32.
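
The README context lines state that a requested calibration sampling size of 100 becomes 104 with batch size 8. A minimal sketch of that arithmetic follows, assuming calibration consumes whole batches until at least the requested number of samples has been seen. The function name `effective_calibration_samples` is hypothetical and is not part of nlp-toolkit or Intel® Neural Compressor.

```python
def effective_calibration_samples(sampling_size: int, batch_size: int) -> int:
    """Round a requested sampling size up to a whole number of batches.

    Assumption (not confirmed by the README): calibration always processes
    complete batches, so the last batch is never truncated.
    """
    if sampling_size <= 0 or batch_size <= 0:
        raise ValueError("sampling_size and batch_size must be positive")
    # Ceiling division using integers only, then scale back to samples.
    num_batches = (sampling_size + batch_size - 1) // batch_size
    return num_batches * batch_size


if __name__ == "__main__":
    # 100 requested samples at batch size 8 need 13 batches.
    print(effective_calibration_samples(100, 8))
```

Under that assumption, 100 samples at batch size 8 require 13 batches, which matches the 104 samples reported in the README.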