Triangle104 committed on
Commit 8363534
1 Parent(s): aa412a2

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +0 -72
README.md CHANGED
@@ -17,78 +17,6 @@ library_name: transformers
  This model was converted to GGUF format from [`nvidia/OpenMath2-Llama3.1-8B`](https://huggingface.co/nvidia/OpenMath2-Llama3.1-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/nvidia/OpenMath2-Llama3.1-8B) for more details on the model.
 
- ---
- Model details:
-
- OpenMath2-Llama3.1-8B is obtained by finetuning Llama3.1-8B-Base with OpenMathInstruct-2.
-
- The model outperforms Llama3.1-8B-Instruct on all the popular math benchmarks we evaluate on, especially on MATH, where it improves by 15.9%.
- [Figures: performance of Llama-3.1-8B-Instruct as it is trained on increasing proportions of OpenMathInstruct-2; comparison of OpenMath2-Llama3.1-8B vs. Llama-3.1-8B-Instruct across MATH levels]
-
- | Model | GSM8K | MATH | AMC 2023 | AIME 2024 | Omni-MATH |
- |---|---|---|---|---|---|
- | Llama3.1-8B-Instruct | 84.5 | 51.9 | 9/40 | 2/30 | 12.7 |
- | OpenMath2-Llama3.1-8B (nemo \| HF) | 91.7 | 67.8 | 16/40 | 3/30 | 22.0 |
- | + majority@256 | 94.1 | 76.1 | 23/40 | 3/30 | 24.6 |
- | Llama3.1-70B-Instruct | 95.8 | 67.9 | 19/40 | 6/30 | 19.0 |
- | OpenMath2-Llama3.1-70B (nemo \| HF) | 94.9 | 71.9 | 20/40 | 4/30 | 23.1 |
- | + majority@256 | 96.0 | 79.6 | 24/40 | 6/30 | 27.6 |
-
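The "+ majority@256" rows report majority voting (self-consistency) over 256 sampled solutions per problem instead of a single greedy answer. Below is a minimal sketch of the aggregation step, assuming the final \boxed{} answers have already been extracted from each sample; the data and helper name are illustrative only.

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Return the most frequent extracted answer across all sampled solutions."""
    # most_common(1) yields [(answer, count)] for the top answer.
    return Counter(answers).most_common(1)[0][0]

# Toy stand-in for 256 extracted answers to one problem.
sampled_answers = ["-16"] * 150 + ["-7"] * 80 + ["9"] * 26
print(majority_vote(sampled_answers))  # -> "-16"
```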
- The pipeline we used to produce the data and models is fully open-sourced:
-
- Code
- Models
- Dataset
-
- See our paper to learn more details!
-
- How to use the models?
-
- Our models are trained with the same "chat format" as the Llama3.1-instruct models (same system/user/assistant tokens). Please note that these models have not been instruction-tuned on general data and thus might not provide good answers outside of the math domain.
-
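Because the model reuses the Llama 3.1 chat format, one way to see the exact prompt layout (the system/user/assistant headers and end-of-turn tokens) is to render a message list with the tokenizer's chat template. A minimal sketch using the Hugging Face tokenizer; the toy question is illustrative only.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nvidia/OpenMath2-Llama3.1-8B")

messages = [{"role": "user", "content": "What is 2 + 2?"}]

# Render the prompt as a string (no tokenization) and append the assistant
# header so the model knows it should start generating the answer.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```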
- We recommend using the instructions in our repo to run inference with these models, but here is an example of how to do it through the transformers API:
-
- import transformers
- import torch
-
- model_id = "nvidia/OpenMath2-Llama3.1-8B"
-
- # Load the model as a bfloat16 text-generation pipeline, sharded across available devices.
- pipeline = transformers.pipeline(
-     "text-generation",
-     model=model_id,
-     model_kwargs={"torch_dtype": torch.bfloat16},
-     device_map="auto",
- )
-
- messages = [
-     {
-         "role": "user",
-         "content": "Solve the following math problem. Make sure to put the answer (and only answer) inside \\boxed{}.\n\n" +
-                    "What is the minimum value of $a^2+6a-7$?",
-     },
- ]
-
- outputs = pipeline(
-     messages,
-     max_new_tokens=4096,
- )
- # The last entry of generated_text is the assistant's reply.
- print(outputs[0]["generated_text"][-1]["content"])
-
- Reproducing our results
-
- We provide all instructions to fully reproduce our results.
-
- Citation
-
- If you find our work useful, please consider citing us!
-
- @article{toshniwal2024openmath2,
-   title   = {OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data},
-   author  = {Shubham Toshniwal and Wei Du and Ivan Moshkov and Branislav Kisacanin and Alexan Ayrapetyan and Igor Gitman},
-   year    = {2024},
-   journal = {arXiv preprint arXiv:2410.01560}
- }
-
- Terms of use
-
- By accessing this model, you are agreeing to the Llama 3.1 license terms and conditions, acceptable use policy, and Meta's privacy policy.
-
- ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)
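GGUF-my-repo cards typically follow this line with the Homebrew install command; assuming the standard formula, that step looks like:

```bash
brew install llama.cpp
```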