Update README.md
README.md
CHANGED
@@ -23,6 +23,8 @@ For other versions of the models, see here:
 - [GGMLv2 q4_0 / q5_1](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-05-15) (70M to 2.8B)
 - [GGMLv3 q4_0 / q5_1](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/main) (70M to 2.8B)
 
+**Description:**
+- The motivation behind these quantizations was that the LLaMA series lacks sizes below 7B, whereas it was the norm for older models to be available in as little as ~125M parameters. This makes the LLaMA models uncomfortable to run on hardware with less than 4GB of RAM, even with 2-bit quantization.
 
 # RAM USAGE
 Model | RAM usage
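
For anyone unsure how to consume the GGML files linked above, here is a minimal sketch using the third-party `ctransformers` library, which can load GGML-format GPT-NeoX/Pythia models on CPU. The repo ID comes from the links in the diff; the `model_file` name is illustrative only, so check the repo's file tree for the exact quantization you want.

```python
# Minimal sketch: loading one of the GGML Pythia files on CPU with ctransformers.
# The model_file name below is hypothetical; browse the repo tree for real names.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "Crataco/Pythia-Deduped-Series-GGML",       # repo linked above
    model_file="pythia-deduped-410m-q4_0.bin",  # illustrative filename
    model_type="gpt_neox",                      # Pythia uses the GPT-NeoX architecture
)

# Generate a short completion to confirm the model loaded correctly.
print(llm("The quantized model says:", max_new_tokens=32))
```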