superhot-7b-8k-no-rlhf-test-GGML
Note: LLAMA_ROPE_SCALE from PR #1967 needs to be set to 0.25
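The 0.25 value is the linear RoPE interpolation factor this LoRA was trained with: positions are compressed by 4x so an 8192-token context maps back onto the base model's original 2048-token position range. A minimal sketch of the idea in Python (names here are illustrative, not llama.cpp's internals):

```python
import numpy as np

def rope_angles(position, dim, scale=0.25, base=10000.0):
    # Standard RoPE frequencies for each pair of channels.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    # Linear interpolation: scale the position before computing the angles,
    # so position 8191 * 0.25 falls inside the original 0..2047 range.
    return (position * scale) * inv_freq

# With scale=0.25, the largest 8k-context position reuses rotation angles
# the base model already saw during its 2k-context pretraining.
print(rope_angles(8191, dim=128)[:4])
```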
Merged the base LLaMA model and the LoRA with https://github.com/tloen/alpaca-lora:
Base LLaMA 7B: https://huggingface.co/huggyllama/llama-7b
SuperHOT 7B 8k no-rlhf-test LoRA: https://huggingface.co/kaiokendev/superhot-7b-8k-no-rlhf-test
BASE_MODEL=huggyllama_llama-7b LORA=kaiokendev_superhot-7b-8k-no-rlhf-test python export_hf_checkpoint.py
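For reference, the merge step boils down to folding the low-rank LoRA deltas into the base weights. A minimal sketch of the same operation using the peft library (an assumption about the mechanics; alpaca-lora's export_hf_checkpoint.py does its own equivalent of this):

```python
import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

# Load the fp16 base model, then attach the LoRA adapter on top of it.
base = LlamaForCausalLM.from_pretrained(
    "huggyllama/llama-7b", torch_dtype=torch.float16
)
lora = PeftModel.from_pretrained(base, "kaiokendev/superhot-7b-8k-no-rlhf-test")

# Fold each scaled low-rank delta (B @ A) into its base weight matrix,
# leaving a plain LLaMA checkpoint with no adapter dependency.
merged = lora.merge_and_unload()
merged.save_pretrained("superhot-7b-8k-safetensors", safe_serialization=True)
```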
Converted and quantized with llama.cpp commit 447ccbe:
python convert.py superhot-7b-8k-safetensors --outtype f16 --outfile superhot-7b-8k-no-rlhf-test.ggmlv3.f16.bin
./bin/quantize superhot-7b-8k-no-rlhf-test.ggmlv3.f16.bin superhot-7b-8k-no-rlhf-test.ggmlv3.Q2_K.bin Q2_K
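To sanity-check the quantized file at the full context length, something like the llama-cpp-python bindings can load it (an assumption; these bindings are not part of the steps above, and the RoPE scale note still applies to the underlying llama.cpp build):

```python
from llama_cpp import Llama

# n_ctx=8192 requests the extended context this LoRA was trained for;
# the 0.25 RoPE scale must still be applied by the underlying build.
llm = Llama(model_path="superhot-7b-8k-no-rlhf-test.ggmlv3.Q2_K.bin", n_ctx=8192)
out = llm("Hello, my name is", max_tokens=32)
print(out["choices"][0]["text"])
```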