shivvamm
/

llama-3.18B-OpenMathInstruct-2

shivvamm committed on Oct 21

Commit 0e3d047 (1 parent: f49934f)

Update README.md

Files changed (1)

README.md +102 -1

@@ -10,4 +10,105 @@ tags:
 - maths
 - art
 library_name: transformers
----

+---
+# Llama-3.1 8B - OpenMathInstruct-2
+This model is a fine-tuned version of Llama-3.1 8B aimed at solving mathematical problems. It was trained on the OpenMath dataset to generate solutions from instruction-style prompts.
+## Table of Contents
+- [Model Description](#model-description)
+- [Usage](#usage)
+  - [Installation](#installation)
+  - [Loading the Model](#loading-the-model)
+  - [Inference](#inference)
+    - [Normal Inference](#normal-inference)
+    - [Streaming Inference](#streaming-inference)
+- [Benefits](#benefits)
+- [License](#license)
+## Model Description
+The base Llama-3.1 8B model was fine-tuned on the OpenMath dataset to improve its ability to interpret and solve mathematical problems. It expects an instruction, optionally paired with an input, and generates a response that completes the request.
+## Usage
+### Installation
+To use this model, ensure you have the required libraries installed:
+```bash
+pip install torch transformers unsloth
+```
+### Loading the Model
+You can load the model as follows:
+```python
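+from unsloth import FastLanguageModel
+model_name = "shivvamm/llama-3.18B-OpenMathInstruct-2"
+# from_pretrained returns the model and its tokenizer together as a tuple.
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name=model_name,
+    load_in_4bit=True,  # matches the 4-bit quantization noted under Benefits
+)
+FastLanguageModel.for_inference(model)  # switch Unsloth to its inference mode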
+```
+### Inference
+#### Normal Inference
+For standard inference, you can use the following code snippet:
+```python
+input_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+### Instruction:
+Continue the Fibonacci sequence.
+### Input:
+1, 1, 2, 3, 5, 8
+### Response:
+"""
+inputs = tokenizer(input_prompt, return_tensors="pt").to("cuda")
+outputs = model.generate(**inputs, max_new_tokens=64)
+# batch_decode returns one string per sequence; each holds the prompt followed by the completion.
+response = tokenizer.batch_decode(outputs, skip_special_tokens=True)
+print(response[0])
+```
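Because `batch_decode` returns the prompt together with the generated text, it can help to keep only what follows the `### Response:` marker. The following is a minimal sketch, assuming the prompt template shown in the snippet above. `extract_response` is a hypothetical helper name, not part of `transformers` or Unsloth, and `sample` is a hand-written stand-in string, not real model output.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str) -> str:
    """Return the text after the last '### Response:' marker (hypothetical helper)."""
    # rpartition splits on the last occurrence of the marker.
    _, marker, tail = decoded.rpartition(RESPONSE_MARKER)
    # If the marker is missing, fall back to the full decoded text.
    return tail.strip() if marker else decoded.strip()


# Hand-written stand-in for a decoded sequence (not real model output).
sample = (
    "### Instruction:\nContinue the Fibonacci sequence.\n"
    "### Input:\n1, 1, 2, 3, 5, 8\n"
    "### Response:\n13, 21, 34"
)
print(extract_response(sample))
```

With a real generation, the same helper would be applied to each string returned by `tokenizer.batch_decode`.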
+#### Streaming Inference
+For a more interactive experience, you can use streaming inference, which outputs tokens as they are generated:
+```python
+from transformers import TextStreamer
+input_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+### Instruction:
+Continue the Fibonacci sequence.
+### Input:
+1, 1, 2, 3, 5, 8
+### Response:
+"""
+inputs = tokenizer(input_prompt, return_tensors="pt").to("cuda")
+text_streamer = TextStreamer(tokenizer)
+model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
+```
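Both inference snippets repeat the same prompt template, so it may be convenient to build the prompt from a function. The sketch below assumes the template text exactly as written in those snippets. `PROMPT_TEMPLATE` and `build_prompt` are illustrative names, not part of any library.

```python
# Template copied from the inference snippets; placeholders are filled per request.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n"
    "### Instruction:\n{instruction}\n"
    "### Input:\n{input_text}\n"
    "### Response:\n"
)


def build_prompt(instruction: str, input_text: str) -> str:
    """Fill the instruction and input slots of the template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(instruction=instruction, input_text=input_text)


prompt = build_prompt("Continue the Fibonacci sequence.", "1, 1, 2, 3, 5, 8")
print(prompt)
```

The resulting string can be passed to `tokenizer(...)` in place of `input_prompt`.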
+## Benefits
+- **Fast Inference:** The model loads through Unsloth's `FastLanguageModel`, which is optimized for fast response generation.
+- **High Accuracy:** The model is fine-tuned specifically on mathematical instructions to improve its problem-solving on such tasks.
+- **Low Memory Usage:** 4-bit quantization lets the model run on lower-end GPUs without running out of memory.
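As a rough illustration of the low-memory point, the weight footprint can be estimated from the parameter count and the bits used per weight. This back-of-the-envelope sketch assumes roughly 8 billion parameters (taken from the "8B" in the model name) and counts weights only. Activations, the KV cache, and quantization overhead such as scaling factors are ignored, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 8e9  # assumption: ~8 billion parameters, from "8B" in the model name

for bits in (16, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB")
```

Under these assumptions the estimate is about 16 GB for 16-bit weights and about 4 GB for 4-bit weights.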
+## License
+This model is licensed under the MIT License. See the LICENSE file for more information.