# Meta-Llama-3-120B-Instruct

Meta-Llama-3-120B-Instruct is a [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) self-merge made with [MergeKit](https://github.com/arcee-ai/mergekit/tree/main).

It was inspired by large merges like:

- [cognitivecomputations/MegaDolphin-120b](https://huggingface.co/cognitivecomputations/MegaDolphin-120b)
- [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0)

Special thanks to [Eric Hartford](https://huggingface.co/ehartford) for both inspiring and evaluating this model and to [Charles Goddard](https://huggingface.co/chargoddard) for creating MergeKit.
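
If you want to reproduce a merge like this one, below is a minimal sketch using MergeKit's Python entry point, assuming a config file like the one shown in the Configuration section; the file paths and option values are illustrative, and the MergeKit repository is the authoritative reference for its API.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load a MergeKit YAML config (see the Configuration section below);
# the file name here is illustrative.
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./merged-model",  # output directory (illustrative)
    options=MergeOptions(
        copy_tokenizer=True,   # reuse the base model's tokenizer
        lazy_unpickle=True,    # lower peak RAM while reading shards
        cuda=False,            # a passthrough merge runs fine on CPU
    ),
)
```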

## 🔍 Applications

I recommend using this model for creative writing. It uses the Llama 3 chat template with a default context window of 8K (can be extended with rope theta).
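
As a concrete illustration, here is a minimal sketch of loading the model with Hugging Face Transformers and applying the Llama 3 chat template; the repo ID, rope-theta scaling factor, and generation settings are assumptions for illustration rather than official settings.

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "mlabonne/Meta-Llama-3-120B-Instruct"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Optional: stretch the context window past 8K by raising rope theta.
# Doubling it is an illustrative heuristic, not a tuned recipe; quality
# degrades the further you push past the trained 8K window.
config = AutoConfig.from_pretrained(model_id)
config.rope_theta *= 2
config.max_position_embeddings = 16384

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    device_map="auto",   # shard ~120B parameters across available GPUs
    torch_dtype="auto",
)

# The model follows the standard Llama 3 chat template.
messages = [{"role": "user", "content": "Write the opening of a heist story."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.9)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```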

Check the examples in the evaluation section to get an idea of its performance. The model is generally quite unhinged but has a good writing style. It sometimes outputs typos and is a big fan of uppercase.

## ⚡ Quantized models

Thanks to [Eric Hartford](https://huggingface.co/ehartford), [elinas](https://huggingface.co/elinas), …
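
If you run one of the quantized exports, a minimal sketch with llama-cpp-python is below; the GGUF file name is a placeholder, since the exact quant files depend on which of the linked repos you download from.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Path is a placeholder: point it at whichever GGUF quant you downloaded.
llm = Llama(
    model_path="./meta-llama-3-120b-instruct.Q4_K_M.gguf",
    n_ctx=8192,       # matches the model's default 8K context window
    n_gpu_layers=-1,  # offload every layer to the GPU if it fits
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a short, dramatic monologue."}],
    max_tokens=256,
    temperature=0.9,
)
print(response["choices"][0]["message"]["content"])
```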

## 🏆 Evaluation
This model is great for creative writing but struggles in other tasks. I'd say use it with caution and don't expect it to outperform GPT-4 outside of some specific use cases.

* **X thread by Eric Hartford (creative writing)**: https://twitter.com/erhartford/status/1787050962114207886
* **X thread by Daniel Kaiser (creative writing)**: https://twitter.com/spectate_or/status/1787257261309518101
* **X thread by Simon (reasoning)**: https://twitter.com/NewDigitalEdu/status/1787403266894020893
* **r/LocalLLaMa**: https://www.reddit.com/r/LocalLLaMA/comments/1cl525q/goliath_lovers_where_is_the_feedback_about/

### Creative Writing

Thanks to [Sam Paech](https://huggingface.co/sam-paech) for evaluating this model and sending me his outputs!

![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/-LJ7ivCRIPR1ur-LJHk3m.png)

## 🧩 Configuration

```yaml