wang7776/vicuna-7b-v1.3-attention-sparsity-20

Text Generation

text-generation-inference


wang7776 committed on Feb 5

Commit 6ca94ac • 1 Parent(s): 9ffb1e2

Update README.md

Files changed (1)

README.md +4 -0

README.md CHANGED

@@ -2,6 +2,10 @@
 inference: false
 license: apache-2.0
 ---
+# Overview
+The attention layers of this model have been pruned to 20% sparsity using the [Wanda pruning method](https://arxiv.org/abs/2306.11695). Wanda requires no retraining or weight updates and still achieves competitive performance. The base model is [lmsys/vicuna-7b-v1.3](https://huggingface.co/lmsys/vicuna-7b-v1.3).
 # Vicuna Model Card
 ## Model Details
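
For readers unfamiliar with Wanda, the idea behind the Overview text can be sketched as follows. This is a minimal pure-Python illustration, not the script used to produce this checkpoint: the function name `wanda_prune` and the toy weights and activations are hypothetical, and a real run would operate on the model's attention weight tensors with calibration data. It assumes the scoring rule from the Wanda paper, where each weight is scored by its magnitude times the L2 norm of the matching input feature, and the lowest-scoring weights in each output row are set to zero with no further weight updates.

```python
import math


def wanda_prune(weight, activations, sparsity):
    """Zero out the lowest-scoring weights in each output row.

    weight:      list of rows, shape (out_features, in_features)
    activations: list of calibration token vectors, shape (tokens, in_features)
    sparsity:    fraction of weights to remove per row, e.g. 0.2
    """
    n_in = len(weight[0])
    # L2 norm of each input feature across all calibration tokens.
    feat_norm = [
        math.sqrt(sum(x[j] ** 2 for x in activations)) for j in range(n_in)
    ]
    n_prune = int(n_in * sparsity)  # number of weights removed per output row
    pruned = []
    for row in weight:
        # Wanda score: weight magnitude times input-feature norm.
        scores = [abs(w) * feat_norm[j] for j, w in enumerate(row)]
        # Indices of the n_prune lowest-scoring weights in this row.
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        # Surviving weights are kept unchanged (no retraining or updates).
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


# Toy example: the second input feature has much larger activations.
toy_weight = [[0.5, -0.1, 0.3, 0.2, -0.4]]
toy_activations = [[1, 10, 1, 1, 1], [1, 10, 1, 1, 1]]
print(wanda_prune(toy_weight, toy_activations, 0.2))
# → [[0.5, -0.1, 0.3, 0.0, -0.4]]
```

In the toy example the smallest-magnitude weight (`-0.1`) survives because its input feature is strongly activated, while `0.2` is removed. This is the difference between Wanda and plain magnitude pruning.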