This is a d-Matrix functional reference of the OPT model family. The revisions evaluated below are opt-125m, opt-350m, opt-1.3b, opt-2.7b, and opt-6.7b.
The reference provides the following functional configurations:
Configuration | Explanation |
---|---|
BASELINE | a reference functionally equivalent to the original model |
BASIC | all linear algebraic operands quantized to BFP16-64, and all other operations transformed to approximated kernel simulations |
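
The BFP16-64 format is not described here, so the following is only a rough sketch of the general idea of block floating point. It assumes the name means blocks of 64 elements that share one exponent, with an 8-bit signed mantissa per element. The function `bfp_quantize` and its parameters are hypothetical and not part of dmx_compressor.

```python
import math


def bfp_quantize(values, block_size=64, mantissa_bits=8):
    """Simulate block floating point: one shared exponent per block.

    Assumed layout (not confirmed by the model card): each block of
    `block_size` values shares an exponent, and every value keeps a
    signed integer mantissa of `mantissa_bits` bits.
    """
    out = []
    # Largest representable signed mantissa, e.g. 127 for 8 bits.
    max_int = 2 ** (mantissa_bits - 1) - 1
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        peak = max(abs(v) for v in block)
        if peak == 0.0:
            out.extend(0.0 for _ in block)
            continue
        # frexp gives peak = m * 2**e with 0.5 <= m < 1, so peak < 2**e.
        shared_exp = math.frexp(peak)[1]
        # Step size between adjacent representable values in this block.
        scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
        for v in block:
            q = round(v / scale)
            q = max(-max_int, min(max_int, q))  # clamp to mantissa range
            out.append(q * scale)
    return out
```

Because the step size is set by the largest magnitude in a block, small values that share a block with a much larger one lose precision or round to zero.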
Usage
Install d-Matrix Dmx_Compressor first.
pip install dmx_compressor
The following example instantiates a model and evaluates it.
from dmx.compressor.dmx import pipeline

pipe = pipeline(
    task="text-generation",
    model="d-matrix/opt",
    revision="opt-125m",  # see above for other variants
    dmx_config="BASELINE",  # see above for other variants
)

results = pipe.evaluate(
    metric="d-matrix/dmx_perplexity",
    dataset="wikitext",
    dataset_version="wikitext-2-raw-v1",
)
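
The internals of the `d-matrix/dmx_perplexity` metric are not shown here. As a reminder of what a perplexity score generally means, the sketch below uses the standard definition: the exponential of the mean negative log-likelihood per token. The helper `perplexity` is hypothetical, and details such as context windowing or stride in the actual metric may differ.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) over the given tokens.

    `token_log_probs` holds the natural-log probability the model
    assigned to each reference token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)
```

Under this definition, a model that assigns probability 1/4 to every token has a perplexity of 4; lower is better.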
Evaluation results

perplexity on penn_treebank

Revision \ Configuration | BASELINE | BASIC |
---|---|---|
opt-125m | 29.496986389160156 | 29.628690719604492 |
opt-350m | 23.57796859741211 | 23.683700561523438 |
opt-1.3b | 15.616923332214355 | 15.879881858825684 |
opt-2.7b | 13.993170738220215 | 14.005770683288574 |
opt-6.7b | 12.166489601135254 | 12.196784019470215 |

perplexity on wikitext2

Revision \ Configuration | BASELINE | BASIC |
---|---|---|
opt-125m | 27.661212921142578 | 27.786727905273438 |
opt-350m | 22.00566291809082 | 22.00930404663086 |
opt-1.3b | 14.624724388122559 | 14.811502456665039 |
opt-2.7b | 12.468732833862305 | 12.504587173461914 |
opt-6.7b | 10.856857299804688 | 10.841047286987305 |
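
To compare the two configurations, one option is to compute the relative perplexity change of BASIC against BASELINE. The sketch below does this for the wikitext2 results reported above. The names `WIKITEXT2` and `relative_change` are illustrative only, and the values are copied from the results as listed.

```python
# wikitext2 perplexities from the results above: (BASELINE, BASIC).
WIKITEXT2 = {
    "opt-125m": (27.661212921142578, 27.786727905273438),
    "opt-350m": (22.00566291809082, 22.00930404663086),
    "opt-1.3b": (14.624724388122559, 14.811502456665039),
    "opt-2.7b": (12.468732833862305, 12.504587173461914),
    "opt-6.7b": (10.856857299804688, 10.841047286987305),
}


def relative_change(baseline, basic):
    """Percentage change of BASIC relative to BASELINE (positive = worse)."""
    return (basic - baseline) / baseline * 100.0


changes = {rev: relative_change(*ppl) for rev, ppl in WIKITEXT2.items()}
for rev, pct in changes.items():
    print(f"{rev}: {pct:+.2f}%")
```

On these numbers the BASIC configuration stays within 2% of BASELINE for every revision, and for opt-6.7b it is marginally lower than BASELINE.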