gradientai/Llama-3-8B-Instruct-Gradient-1048k


michaelfeil committed on May 2

Commit 6f4ed08 (1 parent: 3c5e6ea)

Update README.md

Files changed (1): README.md (+2 -2)

@@ -54,8 +54,8 @@ For training data, we generate long contexts by augmenting [SlimPajama](https://
 - [GGUF by Crusoe](https://huggingface.co/crusoeai/Llama-3-8B-Instruct-1048k-GGUF). Note that you need to add 128009 as [special token with llama.cpp](https://huggingface.co/gradientai/Llama-3-8B-Instruct-262k/discussions/13).
 - [MLX-4bit](https://huggingface.co/mlx-community/Llama-3-8B-Instruct-1048k-4bit)
 - [Ollama](https://ollama.com/library/llama3-gradient)
-- vLLM docker image, recommended to load via `--max-model-len 65536`
+- vLLM docker image, recommended to load via `--max-model-len 32768`
+- If you are interested in a hosted version, drop us a mail below.
 ## The Gradient AI Team
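
The changed line recommends loading the model in the vLLM docker image with `--max-model-len 32768`. As a rough illustration, the following sketch assembles such a `docker run` invocation as an argument list. The image name `vllm/vllm-openai:latest`, the GPU and IPC flags, the port mapping, and the helper `build_vllm_docker_cmd` are assumptions made for this example and do not appear in the README; only the model ID and the `--max-model-len` value come from the diff.

```python
import shlex

# Model ID taken from the repository name; everything else below is illustrative.
MODEL_ID = "gradientai/Llama-3-8B-Instruct-Gradient-1048k"


def build_vllm_docker_cmd(
    model: str = MODEL_ID,
    max_model_len: int = 32768,             # value recommended in the updated README
    port: int = 8000,                       # assumed host port
    image: str = "vllm/vllm-openai:latest",  # assumed image tag
) -> list[str]:
    """Return a `docker run` argument list for serving `model` with vLLM (hypothetical helper)."""
    if max_model_len <= 0:
        raise ValueError("max_model_len must be positive")
    return [
        "docker", "run",
        "--gpus", "all",       # assumed: expose all GPUs to the container
        "--ipc=host",          # assumed: share host IPC for shared memory
        "-p", f"{port}:8000",  # map the host port to the server port in the container
        image,
        "--model", model,
        "--max-model-len", str(max_model_len),
    ]


if __name__ == "__main__":
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(build_vllm_docker_cmd()))
```

Passing `max_model_len=65536` reproduces the value from the line this commit removed. The sketch only builds the command and never runs it, so whether a given context length fits depends on the GPU memory available.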