chargoddard/llama-2-26b-trenchcoat-stack

Llama 2 13b is a pretty decent language model. You know what's probably better? Two Llama 2 13b models. In a trenchcoat.

Produced by bakllama.py with this config file:

layer_slices:
  - model: TheBloke/Llama-2-13B-fp16
    start: 0
    end: 40
  - model: TheBloke/Llama-2-13B-fp16
    start: 0
    end: 40

No fine-tuning was done on this model. Yes, it's still coherent somehow.
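
To make the config concrete, here is a minimal sketch of how the two slices expand into a single layer stack. It is not taken from bakllama.py: the `plan_stack` helper and the dict mirror of the YAML are hypothetical, and it assumes `end` is exclusive, so `start: 0, end: 40` selects all 40 decoder layers of Llama 2 13b.

```python
# Hypothetical mirror of the layer_slices config as a plain Python structure
# (avoids a YAML dependency; this is not how bakllama.py reads its config).
layer_slices = [
    {"model": "TheBloke/Llama-2-13B-fp16", "start": 0, "end": 40},
    {"model": "TheBloke/Llama-2-13B-fp16", "start": 0, "end": 40},
]


def plan_stack(slices):
    """Return one (source_model, source_layer) pair per output layer.

    Assumption: `end` is exclusive, so start=0, end=40 means layers 0..39.
    """
    plan = []
    for s in slices:
        for src_layer in range(s["start"], s["end"]):
            plan.append((s["model"], src_layer))
    return plan


plan = plan_stack(layer_slices)
print(len(plan))   # 80 output layers in total
print(plan[39])    # ('TheBloke/Llama-2-13B-fp16', 39): last layer of copy one
print(plan[40])    # ('TheBloke/Llama-2-13B-fp16', 0): first layer of copy two
```

Under that assumption, output layer 40 is fed the hidden state that would normally go to the final norm, and it runs layer 0 of the same model all over again.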

Benchmark results:

Benchmark	Llama2-13b	Llama2-26b-tcs	Percent Change
ARC	59.3	55.03	-7.2%
HellaSwag	82.15	79.9	-2.74%
MMLU	55.67	53.73	-3.48%
TruthfulQA	37.39	40.48	+8.26%
Average	58.63	57.29	-2.29%
Average Minus TQA	65.71	62.89	-4.29%
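
If you want to check the derived numbers yourself, the sketch below recomputes the Percent Change column and both average rows from the four raw scores in the table. The helper names (`pct_change`, `mean`) are illustrative, and it assumes percent change means relative change against the Llama2-13b score.

```python
# Raw scores copied from the table: (Llama2-13b, Llama2-26b-tcs).
scores = {
    "ARC": (59.3, 55.03),
    "HellaSwag": (82.15, 79.9),
    "MMLU": (55.67, 53.73),
    "TruthfulQA": (37.39, 40.48),
}


def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def report(label, base, new):
    print(f"{label}\t{base:.2f}\t{new:.2f}\t{pct_change(base, new):+.2f}%")


# Per-benchmark rows.
for name, (base, new) in scores.items():
    report(name, base, new)

# Average over all four benchmarks.
avg_base = mean(b for b, _ in scores.values())
avg_new = mean(n for _, n in scores.values())
report("Average", avg_base, avg_new)

# Average with TruthfulQA left out.
rest = [pair for name, pair in scores.items() if name != "TruthfulQA"]
rest_base = mean(b for b, _ in rest)
rest_new = mean(n for _, n in rest)
report("Average Minus TQA", rest_base, rest_new)
```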

This tells us two very important things:
