# Meta-Llama-3-120B-Instruct

Meta-Llama-3-120B-Instruct is a [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) self-merge made with [MergeKit](https://github.com/arcee-ai/mergekit/tree/main).

It was inspired by large merges like:

- [cognitivecomputations/MegaDolphin-120b](https://huggingface.co/cognitivecomputations/MegaDolphin-120b)
- [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0)

Special thanks to [Eric Hartford](https://huggingface.co/ehartford) for both inspiring and evaluating this model and to [Charles Goddard](https://huggingface.co/chargoddard) for creating MergeKit.
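
If you want to reproduce a merge like this one, below is a minimal sketch using MergeKit's Python entry point, assuming a config file like the one shown in the Configuration section; the file paths and option values are illustrative, and the MergeKit repository is the authoritative reference for its API.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load a MergeKit YAML config (see the Configuration section below);
# the file name here is illustrative.
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./merged-model",  # output directory (illustrative)
    options=MergeOptions(
        copy_tokenizer=True,   # reuse the base model's tokenizer
        lazy_unpickle=True,    # lower peak RAM while reading shards
        cuda=False,            # a passthrough merge runs fine on CPU
    ),
)
```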

## 🔍 Applications

I recommend using this model for creative writing. It uses the Llama 3 chat template with a default context window of 8K (can be extended with rope theta).
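
As a concrete illustration, here is a minimal sketch of loading the model with Hugging Face Transformers and applying the Llama 3 chat template; the repo ID, rope-theta scaling factor, and generation settings are assumptions for illustration rather than official settings.

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "mlabonne/Meta-Llama-3-120B-Instruct"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Optional: stretch the context window past 8K by raising rope theta.
# Doubling it is an illustrative heuristic, not a tuned recipe; quality
# degrades the further you push past the trained 8K window.
config = AutoConfig.from_pretrained(model_id)
config.rope_theta *= 2
config.max_position_embeddings = 16384

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    device_map="auto",   # shard ~120B parameters across available GPUs
    torch_dtype="auto",
)

# The model follows the standard Llama 3 chat template.
messages = [{"role": "user", "content": "Write the opening of a heist story."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.9)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```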

Check the examples in the evaluation section to get an idea of its performance. The model is generally quite unhinged but has a good writing style. It sometimes outputs typos and is a big fan of uppercase.

## ⚡ Quantized models

Thanks to [Eric Hartford](https://huggingface.co/ehartford), [elinas](https://huggingface.co/elinas), …
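
If you run one of the quantized exports, a minimal sketch with llama-cpp-python is below; the GGUF file name is a placeholder, since the exact quant files depend on which of the linked repos you download from.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Path is a placeholder: point it at whichever GGUF quant you downloaded.
llm = Llama(
    model_path="./meta-llama-3-120b-instruct.Q4_K_M.gguf",
    n_ctx=8192,       # matches the model's default 8K context window
    n_gpu_layers=-1,  # offload every layer to the GPU if it fits
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a short, dramatic monologue."}],
    max_tokens=256,
    temperature=0.9,
)
print(response["choices"][0]["message"]["content"])
```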

## 🏆 Evaluation
This model is great for creative writing but struggles in other tasks. I'd say use it with caution and don't expect it to outperform GPT-4 outside of some specific use cases.

* **X thread by Eric Hartford (creative writing)**: https://twitter.com/erhartford/status/1787050962114207886
* **X thread by Daniel Kaiser (creative writing)**: https://twitter.com/spectate_or/status/1787257261309518101
* **X thread by Simon (reasoning)**: https://twitter.com/NewDigitalEdu/status/1787403266894020893
* **r/LocalLLaMa**: https://www.reddit.com/r/LocalLLaMA/comments/1cl525q/goliath_lovers_where_is_the_feedback_about/

### Creative Writing

Thanks to [Sam Paech](https://huggingface.co/sam-paech) for evaluating this model and sending me his outputs!

![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/-LJ7ivCRIPR1ur-LJHk3m.png)

## 🧩 Configuration

```yaml