alpindale/Mistral-7B-Instruct-v0.2-AQLM-2Bit-1x16
alpindale committed on Mar 14
Commit 86407f7 • 1 Parent(s): 9fcfbc0
Create README.md
Files changed (1): README.md (+2, -0)

README.md ADDED
@@ -0,0 +1,2 @@
+Took 42 hours to quantize on 4x A40s, at a batch size of 128. I could've gone higher, but hindsight.
+At that batch size, it was using about 25-30 GiB per GPU, and utilization remained at 100%.
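For anyone wanting to use the result of this quantization run: a minimal loading sketch, assuming a recent `transformers` with AQLM support plus the `aqlm` package installed (`pip install aqlm[gpu]`); the helper function and its defaults here are illustrative, not part of this repo.

```python
# Sketch: load this AQLM 2-bit quantized checkpoint with transformers.
# Assumes: transformers >= 4.38 (AQLM integration) and the `aqlm` package.
MODEL_ID = "alpindale/Mistral-7B-Instruct-v0.2-AQLM-2Bit-1x16"

def load_model(model_id: str = MODEL_ID):
    """Return (tokenizer, model) for the quantized checkpoint.

    Imports are deferred so merely defining this helper does not
    require transformers to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" spreads the (already small, ~2-bit) weights
    # across available GPUs; torch_dtype="auto" keeps the dtype the
    # checkpoint was saved with.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",
        device_map="auto",
    )
    return tokenizer, model
```

Since the weights are AQLM-quantized, no extra quantization config is passed at load time; the quantization metadata ships inside the checkpoint itself.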