Triangle104/OpenMath2-Llama3.1-8B-Q6_K-GGUF

@@ -17,78 +17,6 @@ library_name: transformers
 This model was converted to GGUF format from [`nvidia/OpenMath2-Llama3.1-8B`](https://huggingface.co/nvidia/OpenMath2-Llama3.1-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/nvidia/OpenMath2-Llama3.1-8B) for more details on the model.
----
-Model details:
--
-OpenMath2-Llama3.1-8B is obtained by finetuning Llama3.1-8B-Base with OpenMathInstruct-2.
-The model outperforms Llama3.1-8B-Instruct on all the popular math benchmarks we evaluate on, including a 15.9-point gain on MATH (51.9 to 67.8).
-[Performance of Llama-3.1-8B-Instruct as it is trained on increasing proportions of OpenMathInstruct-2] [Comparison of OpenMath2-Llama3.1-8B vs. Llama-3.1-8B-Instruct across MATH levels]
-| Model | GSM8K | MATH | AMC 2023 | AIME 2024 | Omni-MATH |
-|---|---|---|---|---|---|
-| Llama3.1-8B-Instruct | 84.5 | 51.9 | 9/40 | 2/30 | 12.7 |
-| OpenMath2-Llama3.1-8B (nemo \| HF) | 91.7 | 67.8 | 16/40 | 3/30 | 22.0 |
-| + majority@256 | 94.1 | 76.1 | 23/40 | 3/30 | 24.6 |
-| Llama3.1-70B-Instruct | 95.8 | 67.9 | 19/40 | 6/30 | 19.0 |
-| OpenMath2-Llama3.1-70B (nemo \| HF) | 94.9 | 71.9 | 20/40 | 4/30 | 23.1 |
-| + majority@256 | 96.0 | 79.6 | 24/40 | 6/30 | 27.6 |
-The pipeline we used to produce the data and models is fully open-sourced!
-    Code
-    Models
-    Dataset
-See our paper to learn more details!
-How to use the models?
-Our models are trained with the same "chat format" as Llama3.1-instruct models (same system/user/assistant tokens). Please note that these models have not been instruction tuned on general data and thus might not provide good answers outside the math domain.
-We recommend using the instructions in our repo to run inference with these models, but here is an example of how to do it through the transformers API:
-import transformers
-import torch
-model_id = "nvidia/OpenMath2-Llama3.1-8B"
-pipeline = transformers.pipeline(
-    "text-generation",
-    model=model_id,
-    model_kwargs={"torch_dtype": torch.bfloat16},
-    device_map="auto",
-)
-messages = [
-    {
-        "role": "user",
-        "content": "Solve the following math problem. Make sure to put the answer (and only answer) inside \\boxed{}.\n\n" +
-        "What is the minimum value of $a^2+6a-7$?"},
-]
-outputs = pipeline(
-    messages,
-    max_new_tokens=4096,
-)
-print(outputs[0]["generated_text"][-1]['content'])
-Reproducing our results
-We provide all instructions to fully reproduce our results.
-Citation
-If you find our work useful, please consider citing us!
-@article{toshniwal2024openmath2,
-  title   = {OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data},
-  author  = {Shubham Toshniwal and Wei Du and Ivan Moshkov and  Branislav Kisacanin and Alexan Ayrapetyan and Igor Gitman},
-  year    = {2024},
-  journal = {arXiv preprint arXiv:2410.01560}
-}
-Terms of use
-By accessing this model, you are agreeing to the Llama 3.1 terms and conditions of the license, acceptable use policy and Meta’s privacy policy.
----
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)

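
The `majority@256` rows in the removed model details report accuracy after sampling many completions per problem and keeping the most frequent final answer. The following is a minimal sketch of that idea, not the authors' evaluation code: the helper names `extract_boxed` and `majority_vote` are hypothetical, and answers are compared as exact strings, whereas a real evaluation would likely normalize mathematically equivalent answers first.

```python
from collections import Counter
from typing import List, Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1          # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
    return None        # braces never closed


def majority_vote(completions: List[str]) -> Optional[str]:
    """Pick the most frequent boxed answer; completions without one are ignored."""
    answers = [a for a in map(extract_boxed, completions) if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Hand-written stand-ins for sampled completions (assumption: real use
    # would collect these from repeated sampling of the model).
    samples = [
        "Completing the square gives (a+3)^2 - 16, so \\boxed{-16}",
        "The vertex is at a = -3, giving \\boxed{-16}.",
        "I think the answer is \\boxed{16}",
        "No final answer here.",
    ]
    print(majority_vote(samples))  # prints -16
```

With a sampling-enabled generation loop, the list passed to `majority_vote` would hold one completion per sample for the same prompt.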