---
library_name: transformers
tags:
- trl
- sft
---
# Model Card for phi-2-function-calling
## Model Overview
### Summary of the Model
The primary purpose of this fine-tuned model is **Function Calling**. It is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) specifically adapted to handle function-calling tasks efficiently. The model can generate structured text, making it particularly suited for scenarios requiring automated function invocation based on textual instructions.
### Model Type
Text generation (decoder-only causal language model), fine-tuned for function-calling tasks.
## Model Details
### Model Description
- **Developed by:** Microsoft (base model); fine-tuned by Carlos Rodrigues (DataKensei)
- **Model Type:** Text Generation, trained for Function Calling tasks.
- **Language(s):** English
- **License:** MIT License
- **Finetuned from model:** [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
### Model Sources
- **Repository:** [DataKensei/phi-2-function-calling](https://huggingface.co/DataKensei/phi-2-function-calling)
## Uses
### Direct Use
The model is directly usable for generating function calls based on user prompts. This includes structured tasks like scheduling meetings, calculating savings, or any scenario where a text input should translate into an actionable function.
### Downstream Use
While the model is primarily designed for function calling, it can be fine-tuned further or integrated into larger systems where similar structured text generation is required. For example, it could be part of a larger chatbot system that automates task handling.
### Out-of-Scope Use
The model is not designed for tasks unrelated to structured text generation or function calling. Misuse might include attempts to use it for general-purpose language modeling or content generation beyond its specialized training focus.
## Bias, Risks, and Limitations
### Biases
The model may inherit biases from the base model (microsoft/phi-2), particularly those related to the English language and specific function-calling tasks. Users should be aware of potential biases in task framing and language interpretation.
### Limitations
- **Task-Specific**: The model is specialized for function-calling tasks and might not perform well on other types of text generation tasks.
- **English Only**: The model is limited to English, and performance in other languages is not guaranteed.
### Recommendations
Users should test the model in their specific environment to ensure it performs as expected for the desired use case. Awareness of the model's biases and limitations is crucial when deploying it in critical systems.
## How to Get Started with the Model
You can use the following code snippet to get started with the model:
```python
from transformers import pipeline
# Load the model and tokenizer
pipe = pipeline(task="text-generation", model="DataKensei/phi-2-function-calling")
# Example prompt
prompt = '''
<|im_start|>system
You are a helpful assistant with access to the following functions. Use these functions when they are relevant to assist with a user's request
[
{
"name": "calculate_retirement_savings",
"description": "Project the savings at retirement based on current contributions.",
"parameters": {
"type": "object",
"properties": {
"current_age": {
"type": "integer",
"description": "The current age of the individual."
},
"retirement_age": {
"type": "integer",
"description": "The desired retirement age."
},
"current_savings": {
"type": "number",
"description": "The current amount of savings."
},
"monthly_contribution": {
"type": "number",
"description": "The monthly contribution towards retirement savings."
}
},
"required": ["current_age", "retirement_age", "current_savings", "monthly_contribution"]
}
}
]
<|im_start|>user
I am currently 40 years old and plan to retire at 65. I have no savings at the moment, but I intend to save $500 every month. Could you project the savings at retirement based on current contributions?
'''
result = pipe(prompt)
print(result[0]['generated_text'])
```
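The model itself only generates a structured function call; the calling application is responsible for parsing that output and executing the corresponding function. Below is a minimal, hypothetical dispatch sketch: it assumes the completion contains a JSON object with `name` and `arguments` fields, and the projection logic is a simplified stand-in rather than anything shipped with the model.
```python
import json
import re

def calculate_retirement_savings(current_age, retirement_age, current_savings, monthly_contribution):
    # Simplified projection (no interest) used only to illustrate the dispatch step.
    months = (retirement_age - current_age) * 12
    return current_savings + monthly_contribution * months

# Map function names from the schema to real implementations.
AVAILABLE_FUNCTIONS = {"calculate_retirement_savings": calculate_retirement_savings}

def dispatch(completion):
    # Assumes the completion contains a JSON object such as
    # {"name": "calculate_retirement_savings", "arguments": {...}}.
    match = re.search(r"\{.*\}", completion, re.DOTALL)
    if match is None:
        return None
    call = json.loads(match.group(0))
    func = AVAILABLE_FUNCTIONS[call["name"]]
    return func(**call["arguments"])

# Strip the prompt from the pipeline output and run the requested function.
completion = result[0]["generated_text"][len(prompt):]
print(dispatch(completion))
```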
## Training Details
### Training Data
The model was fine-tuned using a synthetic dataset of function-calling prompts and responses. The data was curated to cover a wide range of potential function calls, ensuring the model's applicability to various structured text generation tasks.
The script to generate the data can be found in this [repository](https://xxxxxxxx).
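Each synthetic record pairs a system message listing the available function schemas and a user request with the expected assistant output (a structured function call). Purely as an illustration (the exact record layout is defined by the generation script linked above, and these field names are assumptions), a sample could look like:
```python
# Hypothetical shape of one synthetic training record; field names are assumptions.
sample = {
    "system": "You are a helpful assistant with access to the following functions...",
    "functions": [{"name": "calculate_retirement_savings",
                   "description": "Project the savings at retirement based on current contributions."}],
    "user": "I am currently 40 years old and plan to retire at 65. I have no savings, but I intend to save $500 every month.",
    "assistant": '{"name": "calculate_retirement_savings", "arguments": {"current_age": 40, "retirement_age": 65, "current_savings": 0, "monthly_contribution": 500}}',
}
```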
### Training Procedure
- **Training regime:** The model was fine-tuned using 4-bit precision with `bnb_4bit` quantization on NVIDIA GPUs.
- **Optimizer:** PagedAdamW (32-bit)
- **Learning Rate:** 2e-4
- **Batch Size:** 2 (with gradient accumulation steps = 1)
- **Epochs:** 1
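For context, the sketch below shows how a comparable QLoRA-style run could be set up with the hyperparameters listed above. It is a minimal illustration rather than the actual training script: the LoRA settings, dataset path, and text field are assumptions, and argument names can differ between `trl` versions.
```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer

# 4-bit (bnb_4bit) quantization, matching the training regime above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter settings are illustrative; the values actually used are not documented here.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# Hyperparameters from the list above: PagedAdamW (32-bit), lr 2e-4, batch size 2, 1 epoch.
training_args = TrainingArguments(
    output_dir="phi-2-function-calling",
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    num_train_epochs=1,
    logging_steps=10,
)

# Placeholder path for the synthetic function-calling dataset described above.
dataset = load_dataset("json", data_files="function_calling_train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",
)
trainer.train()
```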
#### Preprocessing
The training and evaluation data was generated using this [repository](https://xxxxxxxx).
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
#### Testing Data
The model was evaluated using a separate test set, comprising 10% of the original dataset, containing various function-calling scenarios.
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
[More Information Needed]
### Results
[More Information Needed]
#### Summary
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
<!-- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). -->
Experiments were conducted on private infrastructure with a carbon efficiency of 0.432 kgCO$_2$eq/kWh. A cumulative 10 hours of computation was performed on a GTX 1080 GPU (TDP of 180 W).
Total emissions are estimated at 0.78 kgCO$_2$eq (180 W × 10 h = 1.8 kWh; 1.8 kWh × 0.432 kgCO$_2$eq/kWh ≈ 0.78 kgCO$_2$eq), of which 0 percent was directly offset.
Estimates were produced with the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
```BibTeX
@article{lacoste2019quantifying,
title={Quantifying the Carbon Emissions of Machine Learning},
author={Lacoste, Alexandre and Luccioni, Alexandra and Schmidt, Victor and Dandres, Thomas},
journal={arXiv preprint arXiv:1910.09700},
year={2019}
}
```
- **Hardware Type:** NVIDIA GPUs (GTX 1080)
- **Hours used:** 10
- **Cloud Provider:** Private Infrastructure
- **Carbon Emitted:** 0.78 kgCO$_2$eq
## Technical Specifications
### Model Architecture and Objective
The model is based on the "microsoft/phi-2" architecture, fine-tuned specifically for function-calling tasks. The objective was to optimize the model's ability to generate structured text suitable for automated function execution.
### Compute Infrastructure
[More Information Needed]
#### Hardware
The model was trained on NVIDIA GPUs.
#### Software
The training used PyTorch and the Hugging Face Transformers library, with additional support from the PEFT library for fine-tuning.
## Citation
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
**BibTeX:**
```BibTeX
@misc{phi2functioncalling,
title={phi-2-function-calling},
author={Carlos Rodrigues},
year={2024},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/DataKensei/phi-2-function-calling}},
}
```
## Model Card Contact
For more information, please contact Carlos Rodrigues at DataKensei. |