Update README.md

README.md CHANGED
@@ -1,3 +1,5 @@
+AI Model Name: Llama 3 70B "Built with Meta Llama 3" https://llama.meta.com/llama3/license/
+
 How to quantize 70B model so it will fit on 2x4090 GPUs:
 
 I tried EXL2, AutoAWQ, and SqueezeLLM and they all failed for different reasons (issues opened).
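For context, the memory arithmetic behind the 2x4090 target can be sketched as follows. This is an illustrative back-of-the-envelope estimate, assuming 24 GB of VRAM per RTX 4090 and roughly 4-bit weights after quantization; overhead for the KV cache, activations, and framework buffers is ignored:

```python
def weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of a model's weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

NUM_PARAMS = 70e9          # 70B-parameter model
TOTAL_VRAM_GB = 2 * 24     # two RTX 4090s at 24 GB each

fp16_gb = weight_size_gb(NUM_PARAMS, 16)  # 140 GB: far too large for 48 GB
int4_gb = weight_size_gb(NUM_PARAMS, 4)   # 35 GB: fits, with headroom left over
```

So unquantized FP16 weights alone need about 140 GB, while a 4-bit quantization brings the weights down to about 35 GB, which is why 4-bit schemes (EXL2, AWQ, and similar) are the natural candidates for a 48 GB budget.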