tags:
- maths
- art
library_name: transformers
---

# Llama-3.1 8B - OpenMathInstruct-2

This model is a fine-tuned version of Llama-3.1 8B built specifically for solving mathematical problems. Fine-tuned on the OpenMathInstruct-2 dataset, it generates accurate mathematical solutions from instructional prompts.

## Table of Contents

- [Model Description](#model-description)
- [Usage](#usage)
  - [Installation](#installation)
  - [Loading the Model](#loading-the-model)
  - [Inference](#inference)
    - [Normal Inference](#normal-inference)
    - [Streaming Inference](#streaming-inference)
- [Benefits](#benefits)
- [License](#license)

## Model Description

Llama-3.1 8B has been fine-tuned on the OpenMathInstruct-2 dataset, which strengthens its ability to interpret and solve mathematical problems. The model is particularly adept at following instructions and providing appropriate solutions.

## Usage

### Installation

To use this model, make sure the required libraries are installed:

```bash
pip install torch transformers unsloth
```

### Loading the Model

`FastLanguageModel.from_pretrained` returns the model and its tokenizer in a single call:

```python
from unsloth import FastLanguageModel

# from_pretrained returns a (model, tokenizer) pair
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="shivvamm/llama-3.18B-OpenMathInstruct-2",
    max_seq_length=2048,  # adjust to the context length you need
    load_in_4bit=True,    # 4-bit quantization keeps memory usage low
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path
```

### Inference

#### Normal Inference

For standard inference, you can use the following code snippet:

```python
input_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Continue the Fibonacci sequence.

### Input:
1, 1, 2, 3, 5, 8

### Response:
"""

# Tokenize the prompt and move it to the GPU
inputs = tokenizer(input_prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
# batch_decode returns a list with one string per sequence
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(response[0])
```
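
Note that `generate` returns the prompt tokens followed by the completion, so the decoded string echoes the prompt. A minimal sketch for keeping only the model's answer, splitting on the `### Response:` delimiter from the prompt template above:

```python
# The decoded output includes the prompt; keep only the text that
# follows the "### Response:" delimiter of the prompt template.
answer = response[0].split("### Response:")[-1].strip()
print(answer)
```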

#### Streaming Inference

For a more interactive experience, use streaming inference, which prints tokens as they are generated:

```python
from transformers import TextStreamer

input_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Continue the Fibonacci sequence.

### Input:
1, 1, 2, 3, 5, 8

### Response:
"""

inputs = tokenizer(input_prompt, return_tensors="pt").to("cuda")
# TextStreamer writes each decoded token to stdout as soon as it is generated
text_streamer = TextStreamer(tokenizer)
model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
```
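
By default the streamer also echoes the prompt while it is being processed. If you only want the newly generated tokens, `TextStreamer` accepts a `skip_prompt` flag:

```python
# Suppress the prompt echo; only newly generated tokens are printed
text_streamer = TextStreamer(tokenizer, skip_prompt=True)
model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
```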

## Benefits

- **Fast Inference:** Optimized with Unsloth for efficient response generation.
- **High Accuracy:** Fine-tuned specifically on mathematical instruction data, which strengthens its problem-solving ability.
- **Low Memory Usage:** 4-bit quantization lets the model run on lower-end GPUs without running out of memory (see the sketch below).
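
If you prefer plain `transformers` to Unsloth, 4-bit loading works there as well. A minimal sketch using `BitsAndBytesConfig`, assuming the repository contains full merged weights rather than only a LoRA adapter:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit quantization via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "shivvamm/llama-3.18B-OpenMathInstruct-2",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("shivvamm/llama-3.18B-OpenMathInstruct-2")
```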

## License

This model is licensed under the MIT License. See the LICENSE file for more information.