Triangle104/Arcee-Maestro-7B-Preview-Q5_K_S-GGUF
This model was converted to GGUF format from arcee-ai/Arcee-Maestro-7B-Preview
using llama.cpp via the ggml.ai's GGUF-my-repo space.
Refer to the original model card for more details on the model.
Arcee-Maestro-7B-Preview (7B) is Arcee's first reasoning model trained with reinforment learning. It is based on the Qwen2.5-7B DeepSeek-R1 distillation DeepSeek-R1-Distill-Qwen-7B with further GPRO training. Though this is just a preview of our upcoming work, it already shows promising improvements to mathematical and coding abilities across a range of tasks.
Intended Use Cases
Advanced reasoning
Mathematics
Coding
Training & Fine-Tuning
Initial Training: Began with DeepSeek-R1-Distill-Qwen-7B GRPO: Trained on 450,000 verified math problems Additional bootstrapped coding examples
Performance
Arcee-Maestro-7B-Preview shows strong performance in mathematics as well as coding, competing against even O1 preview, a model far surprassing its size.
Limitations
Context Length: 128k Tokens (may vary depending on the final tokenizer settings and system resources). Knowledge Cut-off: Training data may not reflect the latest events or developments beyond June 2024.
Ethical Considerations
Content Generation Risks: Like any language model, Arcee-Maestro-7B-Preview can generate potentially harmful or biased content if prompted in certain ways.
License
Arcee-Maestro-7B-Preview (7B) is released under the Apache-2.0 License. You are free to use, modify, and distribute this model in both commercial and non-commercial applications, subject to the terms and conditions of the license.
Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
brew install llama.cpp
Invoke the llama.cpp server or the CLI.
CLI:
llama-cli --hf-repo Triangle104/Arcee-Maestro-7B-Preview-Q5_K_S-GGUF --hf-file arcee-maestro-7b-preview-q5_k_s.gguf -p "The meaning to life and the universe is"
Server:
llama-server --hf-repo Triangle104/Arcee-Maestro-7B-Preview-Q5_K_S-GGUF --hf-file arcee-maestro-7b-preview-q5_k_s.gguf -c 2048
Note: You can also use this checkpoint directly through the usage steps listed in the Llama.cpp repo as well.
Step 1: Clone llama.cpp from GitHub.
git clone https://github.com/ggerganov/llama.cpp
Step 2: Move into the llama.cpp folder and build it with LLAMA_CURL=1
flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
cd llama.cpp && LLAMA_CURL=1 make
Step 3: Run inference through the main binary.
./llama-cli --hf-repo Triangle104/Arcee-Maestro-7B-Preview-Q5_K_S-GGUF --hf-file arcee-maestro-7b-preview-q5_k_s.gguf -p "The meaning to life and the universe is"
or
./llama-server --hf-repo Triangle104/Arcee-Maestro-7B-Preview-Q5_K_S-GGUF --hf-file arcee-maestro-7b-preview-q5_k_s.gguf -c 2048
- Downloads last month
- 18
Model tree for Triangle104/Arcee-Maestro-7B-Preview-Q5_K_S-GGUF
Base model
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B