Commit 9e9cdad by michaelfeil
1 Parent(s): 6f4ed08

Update README.md

Files changed (1)
  1. README.md +26 -1
README.md CHANGED
@@ -50,7 +50,32 @@ For training data, we generate long contexts by augmenting [SlimPajama](https://
  | GPU Type | NVIDIA L40S | NVIDIA L40S | NVIDIA L40S | NVIDIA L40S |
  | Minutes to Train (Wall)| 202 | 555 | 61 | 87 |
 
- **Inference / Quants**:
+
+ **Evaluation:**
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6585dc9be92bc5f258156bd6/mWxIGZNi3ejlmeIDWafKu.png)
+
+ ```
+ EVAL_MAX_CONTEXT_LENGTH=1040200
+ EVAL_MIN_CONTEXT_LENGTH=100
+ EVAL_CONTEXT_INTERVAL=86675
+ EVAL_DEPTH_INTERVAL=0.2
+ EVAL_RND_NUMBER_DIGITS=8
+
+ HAYSTACK1:
+ EVAL_GENERATOR_TOKENS=25
+
+ HAYSTACK2:
+ EVAL_CONTEXT_INTERVAL=173350
+ EVAL_GENERATOR_TOKENS=150000
+
+ HAYSTACK3:
+ EVAL_GENERATOR_TOKENS=925000
+ ```
+
+ All boxes not pictured for Haystack 1 and 3 are 100% accurate. Haystacks 1,2 and 3 are further detailed in this [blog post](https://gradient.ai/blog/the-haystack-matters-for-niah-evals).
+
+ **Quants:**
  - [GGUF by Crusoe](https://huggingface.co/crusoeai/Llama-3-8B-Instruct-1048k-GGUF). Note that you need to add 128009 as [special token with llama.cpp](https://huggingface.co/gradientai/Llama-3-8B-Instruct-262k/discussions/13).
  - [MLX-4bit](https://huggingface.co/mlx-community/Llama-3-8B-Instruct-1048k-4bit)
  - [Ollama](https://ollama.com/library/llama3-gradient)
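
The evaluation settings added in this commit appear to describe a needle-in-a-haystack grid: a range of context lengths crossed with a range of needle depths. The sketch below works through that arithmetic. It assumes the interval values are inclusive step sizes and that the needle is a random number with `EVAL_RND_NUMBER_DIGITS` digits; the helper names (`context_lengths`, `depths`, `random_needle`) are hypothetical and are not taken from the actual evaluation harness.

```python
import random

# Values copied from the evaluation settings in the README change.
EVAL_MAX_CONTEXT_LENGTH = 1040200
EVAL_MIN_CONTEXT_LENGTH = 100
EVAL_CONTEXT_INTERVAL = 86675
EVAL_DEPTH_INTERVAL = 0.2
EVAL_RND_NUMBER_DIGITS = 8
HAYSTACK2_CONTEXT_INTERVAL = 173350  # HAYSTACK2 overrides the interval


def context_lengths(min_len, max_len, step):
    """Context lengths from min_len to max_len inclusive (assumed semantics)."""
    return list(range(min_len, max_len + 1, step))


def depths(step):
    """Needle depths from 0.0 to 1.0 inclusive (assumed semantics)."""
    count = round(1 / step)
    # Round to suppress floating-point noise such as 0.6000000000000001.
    return [round(i * step, 10) for i in range(count + 1)]


def random_needle(digits, rng):
    """A random number with exactly `digits` digits (hypothetical needle format)."""
    return str(rng.randrange(10 ** (digits - 1), 10 ** digits))


lengths = context_lengths(
    EVAL_MIN_CONTEXT_LENGTH, EVAL_MAX_CONTEXT_LENGTH, EVAL_CONTEXT_INTERVAL
)
haystack2_lengths = context_lengths(
    EVAL_MIN_CONTEXT_LENGTH, EVAL_MAX_CONTEXT_LENGTH, HAYSTACK2_CONTEXT_INTERVAL
)
depth_grid = depths(EVAL_DEPTH_INTERVAL)

print(len(lengths), lengths[0], lengths[-1])  # 13 100 1040200
print(len(haystack2_lengths))                 # 7
print(depth_grid)                             # [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
print(len(lengths) * len(depth_grid))         # 78
print(random_needle(EVAL_RND_NUMBER_DIGITS, random.Random(0)))
```

Under these assumptions the default interval divides the range exactly (12 steps of 86675 from 100 to 1040200), giving 13 context lengths by 6 depths, or 78 cells; the doubled HAYSTACK2 interval gives 7 context lengths.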