mlabonne committed
Commit 0348118
1 Parent(s): a68b640

Update README.md

Files changed (1): README.md (+11, -3)
README.md CHANGED
@@ -18,7 +18,7 @@ base_model:
 
 # Meta-Llama-3-120B-Instruct
 
-Meta-Llama-3-120B-Instruct is a self-merge with [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct).
+Meta-Llama-3-120B-Instruct is a [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) self-merge made with [MergeKit](https://github.com/arcee-ai/mergekit/tree/main).
 
 It was inspired by large merges like:
 
@@ -27,11 +27,13 @@ It was inspired by large merges like:
 - [cognitivecomputations/MegaDolphin-120b](https://huggingface.co/cognitivecomputations/MegaDolphin-120b)
 - [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0).
 
+Special thanks to [Eric Hartford](https://huggingface.co/ehartford) for both inspiring and evaluating this model and to [Charles Goddard](https://huggingface.co/chargoddard) for creating MergeKit.
+
 ## 🔍 Applications
 
 I recommend using this model for creative writing. It uses the Llama 3 chat template with a default context window of 8K (can be extended with rope theta).
 
-Check the examples in the evaluation section to get an idea of its performance.
+Check the examples in the evaluation section to get an idea of its performance. The model is generally quite unhinged but has a good writing style. It sometimes outputs typos and is a big fan of uppercase.
 
 ## ⚡ Quantized models
 
@@ -43,13 +45,19 @@ Thanks to [Eric Hartford](https://huggingface.co/ehartford), [elinas](https://hu
 
 ## 🏆 Evaluation
 
-The model looks excellent for creating writing tasks, outperforming GPT-4. Thanks again to [Eric Hartford](https://huggingface.co/ehartford) for noticing this.
+This model is great for creative writing but struggles in other tasks. I'd say use it with caution and don't expect it to outperform GPT-4 outside of some specific use cases.
 
 * **X thread by Eric Hartford (creative writing)**: https://twitter.com/erhartford/status/1787050962114207886
 * **X thread by Daniel Kaiser (creative writing)**: https://twitter.com/spectate_or/status/1787257261309518101
 * **X thread by Simon (reasoning)**: https://twitter.com/NewDigitalEdu/status/1787403266894020893
 * **r/LocalLLaMa**: https://www.reddit.com/r/LocalLLaMA/comments/1cl525q/goliath_lovers_where_is_the_feedback_about/
 
+### Creative Writing
+
+Thanks to [Sam Paech](https://huggingface.co/sam-paech) for evaluating this model and sending me his outputs!
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/-LJ7ivCRIPR1ur-LJHk3m.png)
+
 ## 🧩 Configuration
 
 ```yaml
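The README notes that the default 8K context "can be extended with rope theta". As a rough illustration of why raising the RoPE base theta stretches the usable context, here is a minimal sketch of the rotary-embedding frequency schedule. The values are hypothetical and for illustration only (Llama 3 ships with `rope_theta = 500000` in its Hugging Face config; the extended value below is an arbitrary choice, not a recommendation):

```python
import math

def rope_frequencies(head_dim: int, theta: float) -> list[float]:
    # Rotary position embeddings rotate each pair of channels i at a
    # frequency theta^(-2i/d). A larger base theta means slower rotations,
    # i.e. longer wavelengths per channel pair.
    return [theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

# Hypothetical head_dim/theta values for illustration.
default = rope_frequencies(128, 500_000.0)      # stock Llama 3 base
extended = rope_frequencies(128, 2_000_000.0)   # raised theta for longer context

# Raising theta lowers every non-trivial frequency, so the slowest channel's
# wavelength grows and distant positions remain distinguishable.
longest_wavelength_default = 2 * math.pi / default[-1]
longest_wavelength_extended = 2 * math.pi / extended[-1]
assert longest_wavelength_extended > longest_wavelength_default
```

In practice this corresponds to overriding the model's `rope_theta` at load time rather than retraining; quality at the extended lengths is not guaranteed without further tuning.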