Highlighted work
My "greatest hits", sort of
Text Generation • Updated • 58 • 2
Note: Adding o1-inspired reasoning improved on the Instruct model across most benchmarks. As of the initial merge release date, this is my second-highest-scoring Llama 3.x 8B model on the newer Open LLM Leaderboard.
grimjim/SauerHuatuoSkywork-o1-Llama-3.1-8B-GGUF
Text Generation • Updated • 109
grimjim/DeepSauerHuatuoSkywork-R1-o1-Llama-3.1-8B
Text Generation • Updated • 41 • 4
Note: Merging in a touch of DeepSeek R1 distillation improved benchmarks more than it hurt them. This is currently my highest-scoring Llama 3.x 8B model on the newer Open LLM Leaderboard.
grimjim/HuatuoSkywork-o1-Llama-3.1-8B
Text Generation • Updated • 112
Note: This merge of o1 reasoning models achieved an unexpectedly high MATH Level 5 score of 33.99%, which was the highest I saw at the time for Llama 3.x 8B models on the Open LLM Leaderboard.
grimjim/llama-3-Nephilim-v3-8B
Text Generation • Updated • 135 • 13
Note: Proof of concept that a text completion model, based on Instruct in this case, doesn't need any fine-tuning specifically targeting roleplay. All merge components are academic in origin.
grimjim/llama-3-Nephilim-v3-8B-GGUF
Text Generation • Updated • 156 • 12
grimjim/Llama-3.1-8B-Instruct-abliterated_via_adapter
Text Generation • Updated • 5.31k • 29
Note: Llama 3.1 8B "abliterated" by transferring the feature through a LoRA. There's probably some damage to the model, a common consequence of abliteration, that could be fixed with additional fine-tuning.
grimjim/Llama-3.1-8B-Instruct-abliterated_via_adapter-GGUF
Text Generation • Updated • 567 • 25
grimjim/Llama-3-Instruct-abliteration-LoRA-8B
Updated • 7
Note: The LoRA adapter obtained from Llama 3 and later applied to Llama 3.1.
grimjim/kukulemon-7B
Text Generation • Updated • 52 • 11
Note: One of my first merges, combining two smart models with a roleplay-oriented merge. Someone on YouTube called out this Mistral v0.1 7B architecture model in a video.
grimjim/kukulemon-7B-GGUF
Text Generation • Updated • 447 • 2