Pythia Deduped Series GGML
This repository contains quantized conversions of EleutherAI's Pythia Deduped checkpoints.
For use with frontends that support GGML quantized GPT-NeoX models, such as KoboldCpp and Oobabooga (with the CTransformers loader).
Last updated on 2023-05-25.
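As a quick sketch of the CTransformers route mentioned above: loading one of these files with the `ctransformers` Python package might look like the following. The file name is a placeholder for whichever quantization you download, and `gpt_neox` is the model type ctransformers uses for Pythia-style GGML checkpoints.

```python
from ctransformers import AutoModelForCausalLM

# Load a local GGML GPT-NeoX file; model_type tells ctransformers
# which architecture to instantiate for the weights.
llm = AutoModelForCausalLM.from_pretrained(
    "ggmlv3-pythia-410m-deduped-q4_0.bin",  # placeholder file name
    model_type="gpt_neox",
)

# The returned object is callable and produces a text completion.
print(llm("The Pythia suite is", max_new_tokens=32))
```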
For other versions of the models, see here:
- GGMLv1 q4_3 (70M to 12B)
- GGMLv1 q5_0 / q5_1 / q8_0 (70M to 2.8B)
- GGMLv1 q4_0 / q4_2 (70M to 2.8B)
- GGMLv2 q4_0 / q5_1 (70M to 2.8B)
- GGMLv3 q4_0 / q5_1 (70M to 2.8B)
Description:
- The motivation behind these quantizations is that the LLaMA series has no sizes below 7B, whereas older model families were commonly available at sizes as small as ~125M parameters. Even with 2-bit quantization, a 7B model is uncomfortable to run on hardware with less than 4 GB of RAM; these smaller Pythia checkpoints fill that gap.
RAM usage:

| Model | RAM usage |
|---|---|
| Unloaded | 41.3 MiB |
| ggmlv3-pythia-70m-deduped-q4_0.bin | 95.5 MiB |
| ggmlv3-pythia-160m-deduped-q4_0.bin | 201.1 MiB |
| ggmlv3-pythia-410m-deduped-q4_0.bin | 415.1 MiB |
| ggmlv3-pythia-1b-deduped-q4_0.bin | 762.2 MiB |
| ggmlv3-pythia-1.4b-deduped-q4_0.bin | 1.0 GiB |
| ggmlv3-pythia-2.8b-deduped-q4_0.bin | 1.9 GiB |
| ggmlv3-pythia-70m-deduped-q5_1.bin | 108.7 MiB |
| ggmlv3-pythia-160m-deduped-q5_1.bin | 226.9 MiB |
| ggmlv3-pythia-410m-deduped-q5_1.bin | 494.0 MiB |
| ggmlv3-pythia-1b-deduped-q5_1.bin | 943.9 MiB |
| ggmlv3-pythia-1.4b-deduped-q5_1.bin | 1.3 GiB |
| ggmlv3-pythia-2.8b-deduped-q5_1.bin | 2.3 GiB |
Measured in KoboldCpp with OpenBLAS enabled.
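The figures above come from KoboldCpp, so other frontends will not match them exactly. For a rough cross-check of a model's memory footprint when loading through ctransformers, something like the sketch below could work, assuming `psutil` is installed; the file name is again a placeholder.

```python
import psutil
from ctransformers import AutoModelForCausalLM

def rss_mib() -> float:
    # Resident set size of the current Python process, in MiB.
    return psutil.Process().memory_info().rss / (1024 ** 2)

baseline = rss_mib()

llm = AutoModelForCausalLM.from_pretrained(
    "ggmlv3-pythia-410m-deduped-q4_0.bin",  # placeholder file name
    model_type="gpt_neox",
)

# Difference in resident memory before and after loading the weights.
print(f"Approximate model footprint: {rss_mib() - baseline:.1f} MiB")
```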