This repository contains quantized conversions of the AI Dungeon 2 checkpoint, "model_v5".
They are intended for frontends that support GGML-quantized GPT-2 models. The model works best with KoboldCpp's "Adventure" mode.
Last updated on 2023-09-23.
| Model | RAM usage (KoboldCpp) | RAM usage (Oobabooga) |
|---|---|---|
| aid2classic-ggml-q4_0.bin | 984.1 MiB | 1.4 GiB |
| aid2classic-ggml-q4_1.bin | 1.1 GiB | 1.5 GiB |
| aid2classic-ggml-q5_0.bin | 1.2 GiB | 1.6 GiB |
| aid2classic-ggml-q5_1.bin | 1.2 GiB | 1.7 GiB |
| aid2classic-ggml-q8_0.bin | 1.7 GiB | 2.2 GiB |
| aid2classic-ggml-f16.bin | 3.2 GiB | 3.6 GiB |
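
As a rough aid for choosing a file, the RAM figures listed above can be turned into a small lookup. This is only a sketch: the function name `best_quant` and the frontend keys are made up for illustration, the numbers are the rounded measurements from this page, and actual usage will likely vary with context length, backend build, and operating system.

```python
from typing import Optional

GIB = 1024.0  # MiB per GiB

# (file name, KoboldCpp RAM in MiB, Oobabooga RAM in MiB), copied from the
# figures above and ordered from lowest to highest precision.
USAGE_MIB = [
    ("aid2classic-ggml-q4_0.bin", 984.1, 1.4 * GIB),
    ("aid2classic-ggml-q4_1.bin", 1.1 * GIB, 1.5 * GIB),
    ("aid2classic-ggml-q5_0.bin", 1.2 * GIB, 1.6 * GIB),
    ("aid2classic-ggml-q5_1.bin", 1.2 * GIB, 1.7 * GIB),
    ("aid2classic-ggml-q8_0.bin", 1.7 * GIB, 2.2 * GIB),
    ("aid2classic-ggml-f16.bin", 3.2 * GIB, 3.6 * GIB),
]


def best_quant(budget_mib: float, frontend: str = "koboldcpp") -> Optional[str]:
    """Return the highest-precision file whose measured RAM usage fits the budget.

    `frontend` is a hypothetical key: "koboldcpp" or "oobabooga".
    Returns None when even the smallest file exceeds the budget.
    """
    column = {"koboldcpp": 1, "oobabooga": 2}[frontend]
    best = None
    for row in USAGE_MIB:
        # Rows are ordered by increasing precision, so the last fit wins.
        if row[column] <= budget_mib:
            best = row[0]
    return best


if __name__ == "__main__":
    print(best_quant(2 * GIB, "koboldcpp"))
    print(best_quant(2 * GIB, "oobabooga"))
```

Leaving some headroom below the machine's free RAM is probably wise, since the figures were measured on one setup only.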
Description:
- 2019 AI Dungeon users may recognize this model as the same one that powered the open-source AI Dungeon 2 project and its various forks. This was before AI Dungeon 2 moved to its own website and consequently rebranded to "AI Dungeon".
- 2020-2021 AI Dungeon users may recognize this model as "Classic", the free tier below Griffin (free, but later used "energy") and Dragon (subscription).
- If you want a better model trained on the same dataset at the cost of higher hardware requirements, check out Spring Dragon 13B, intended to replicate 2020 AI Dungeon's "Dragon" experience on local hardware.
- These quantizations were made because Henk717/ai-dungeon2-classic-ggml was older and lacked other quantization formats. The workflow was also different: henk717's repository mentions that its model was converted to a 16-bit PyTorch checkpoint before being converted to GGML, whereas this one was converted straight from TensorFlow to 16-bit GGML before being quantized.
Notes:
- KoboldCpp [bfc696f] was tested without OpenBLAS.
- Oobabooga [895ec9d] was tested with the `--model <model> --loader ctransformers --model_type gpt2` launch arguments.
- ggerganov/ggml [8ca2c19]'s gpt-2 conversion script was used for conversion and quantization.
- The original model was found in the `generator/gpt2/models/model_v5` directory of AI Dungeon 2 Unleashed.
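
The Oobabooga launch arguments from the notes can be assembled programmatically, for example when scripting tests across several quantizations. The sketch below is illustrative only: the helper names are hypothetical, and the `server.py` entry point is an assumption about a typical text-generation-webui checkout rather than something stated on this page.

```python
import shlex
from typing import List


def oobabooga_args(model: str) -> List[str]:
    """Build the launch arguments listed in the notes for a given model file."""
    return ["--model", model, "--loader", "ctransformers", "--model_type", "gpt2"]


def launch_command(model: str, entry_point: str = "server.py") -> str:
    """Return a shell-safe command line.

    `entry_point` is an assumption; adjust it to match your installation.
    """
    return shlex.join(["python", entry_point, *oobabooga_args(model)])


if __name__ == "__main__":
    print(launch_command("aid2classic-ggml-q4_0.bin"))
```

Keeping the arguments as a list makes them easy to pass to `subprocess.run` without shell quoting concerns.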