Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -50,6 +50,7 @@ For training data, we generate long contexts by augmenting [SlimPajama](https://
50
  | GPU Type | NVIDIA L40S | NVIDIA L40S | NVIDIA L40S | NVIDIA L40S |
51
  | Minutes to Train (Wall)| 202 | 555 | 61 | 87 |
52
 
 
53
  **Evaluation:**
54
 
55
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6585dc9be92bc5f258156bd6/mWxIGZNi3ejlmeIDWafKu.png)
@@ -75,8 +76,11 @@ HAYSTACK3:
75
  All boxes not pictured for Haystack 1 and 3 are 100% accurate. Haystacks 1,2 and 3 are further detailed in this [blog post](https://gradient.ai/blog/the-haystack-matters-for-niah-evals).
76
 
77
  **Quants:**
78
- - [GGUF](https://huggingface.co/crusoeai/Llama-3-8B-Instruct-1048k-GGUF)
79
  - [MLX-4bit](https://huggingface.co/mlx-community/Llama-3-8B-Instruct-1048k-4bit)
 
 
 
80
 
81
  ## The Gradient AI Team
82
 
 
50
  | GPU Type | NVIDIA L40S | NVIDIA L40S | NVIDIA L40S | NVIDIA L40S |
51
  | Minutes to Train (Wall)| 202 | 555 | 61 | 87 |
52
 
53
+
54
  **Evaluation:**
55
 
56
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6585dc9be92bc5f258156bd6/mWxIGZNi3ejlmeIDWafKu.png)
 
76
  All boxes not pictured for Haystack 1 and 3 are 100% accurate. Haystacks 1,2 and 3 are further detailed in this [blog post](https://gradient.ai/blog/the-haystack-matters-for-niah-evals).
77
 
78
  **Quants:**
79
+ - [GGUF by Crusoe](https://huggingface.co/crusoeai/Llama-3-8B-Instruct-1048k-GGUF). Note that you need to add 128009 as a [special token with llama.cpp](https://huggingface.co/gradientai/Llama-3-8B-Instruct-262k/discussions/13).
80
  - [MLX-4bit](https://huggingface.co/mlx-community/Llama-3-8B-Instruct-1048k-4bit)
81
+ - [Ollama](https://ollama.com/library/llama3-gradient)
82
+ - vLLM Docker image; loading with `--max-model-len 32768` is recommended
83
+ - If you are interested in a hosted version, drop us a mail below.
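
Reviewer note: the vLLM bullet gives only the flag, so here is a minimal sketch of what a full invocation might look like. It only assembles and prints a `docker run` command; it does not start a container. The image name `vllm/vllm-openai:latest`, the model id `gradientai/Llama-3-8B-Instruct-1048k`, the port mapping, and the helper `build_vllm_command` are assumptions for illustration and are not stated in this change. Only `--max-model-len 32768` comes from the README text.

```python
import shlex


def build_vllm_command(model_id: str, max_model_len: int = 32768, port: int = 8000) -> list[str]:
    """Assemble a `docker run` argument list for a vLLM OpenAI-compatible server.

    Hypothetical helper: everything except `--max-model-len` is an assumption.
    """
    return [
        "docker", "run", "--gpus", "all",
        "-p", f"{port}:8000",            # assumed host:container port mapping
        "vllm/vllm-openai:latest",       # assumed image; the README names no image
        "--model", model_id,
        # Cap the context window as the README recommends, instead of the
        # model's full 1048k context.
        "--max-model-len", str(max_model_len),
    ]


# Assumed model id, inferred from the quant links in this README.
MODEL_ID = "gradientai/Llama-3-8B-Instruct-1048k"

command = build_vllm_command(MODEL_ID)
print(shlex.join(command))
```

The printed line can be pasted into a shell on a GPU host. A real deployment would probably also mount a Hugging Face cache volume, which this sketch leaves out.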
84
 
85
  ## The Gradient AI Team
86