---
library_name: transformers
tags:
- trl
- sft
---

# Model Card for phi-2-function-calling

## Model Overview

### Summary of the Model

The primary purpose of this fine-tuned model is **Function Calling**. It is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) specifically adapted to handle function-calling tasks efficiently. The model can generate structured text, making it particularly suited for scenarios requiring automated function invocation based on textual instructions.

### Model Type

Text generation (decoder-only causal language model), fine-tuned for function-calling tasks.

## Model Details

### Model Description

- **Developed by:** Microsoft; fine-tuned by Carlos Rodrigues (DataKensei)
- **Model Type:** Text Generation, trained for Function Calling tasks.
- **Language(s):** English
- **License:** MIT License
- **Finetuned from model:** [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) 

### Model Sources

- **Repository:** [DataKensei/phi-2-function-calling](https://huggingface.co/DataKensei/phi-2-function-calling)

## Uses

### Direct Use

The model is directly usable for generating function calls based on user prompts. This includes structured tasks like scheduling meetings, calculating savings, or any scenario where a text input should translate into an actionable function.

### Downstream Use

While the model is primarily designed for function calling, it can be fine-tuned further or integrated into larger systems where similar structured text generation is required. For example, it could be part of a larger chatbot system that automates task handling.

### Out-of-Scope Use

The model is not designed for tasks unrelated to structured text generation or function calling. Misuse might include attempts to use it for general-purpose language modeling or content generation beyond its specialized training focus.

## Bias, Risks, and Limitations

### Biases

The model may inherit biases from the base model (microsoft/phi-2), particularly those related to the English language and specific function-calling tasks. Users should be aware of potential biases in task framing and language interpretation.

### Limitations

- **Task-Specific**: The model is specialized for function-calling tasks and might not perform well on other types of text generation tasks.
- **English Only**: The model is limited to English, and performance in other languages is not guaranteed.

### Recommendations

Users should test the model in their specific environment to ensure it performs as expected for the desired use case. Awareness of the model's biases and limitations is crucial when deploying it in critical systems.

## How to Get Started with the Model

You can use the following code snippet to get started with the model:

```python
from transformers import pipeline

# Load the model and tokenizer
pipe = pipeline(task="text-generation", model="DataKensei/phi-2-function-calling")

# Example prompt
prompt = '''
<|im_start|>system
You are a helpful assistant with access to the following functions. Use these functions when they are relevant to assist with a user's request.
[
    {
        "name": "calculate_retirement_savings",
        "description": "Project the savings at retirement based on current contributions.",
        "parameters": {
            "type": "object",
            "properties": {
                "current_age": {
                    "type": "integer",
                    "description": "The current age of the individual."
                },
                "retirement_age": {
                    "type": "integer",
                    "description": "The desired retirement age."
                },
                "current_savings": {
                    "type": "number",
                    "description": "The current amount of savings."
                },
                "monthly_contribution": {
                    "type": "number",
                    "description": "The monthly contribution towards retirement savings."
                }
            },
            "required": ["current_age", "retirement_age", "current_savings", "monthly_contribution"]
        }
    }
]
<|im_start|>user
I am currently 40 years old and plan to retire at 65. I have no savings at the moment, but I intend to save $500 every month. Could you project the savings at retirement based on current contributions?
'''

result = pipe(prompt)
print(result[0]['generated_text'])
```
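The exact structure of the completion depends on the fine-tuning data, but assuming the model emits a JSON object with `name` and `arguments` fields (an assumption, not a documented contract), a minimal sketch for dispatching the generated call to a local Python function could look like this. `calculate_retirement_savings` is a hypothetical local implementation of the schema declared in the prompt above:

```python
import json
import re

# Hypothetical local implementation of the schema declared in the system prompt.
def calculate_retirement_savings(current_age, retirement_age, current_savings, monthly_contribution):
    months = (retirement_age - current_age) * 12
    return current_savings + months * monthly_contribution  # interest ignored for simplicity

AVAILABLE_FUNCTIONS = {"calculate_retirement_savings": calculate_retirement_savings}

def dispatch_function_call(generated_text):
    """Extract the first JSON object from the completion and call the matching function."""
    match = re.search(r"\{.*\}", generated_text, re.DOTALL)
    if match is None:
        return None
    call = json.loads(match.group(0))
    func = AVAILABLE_FUNCTIONS.get(call.get("name"))
    return func(**call.get("arguments", {})) if func else None

# Assumed completion format, for illustration only.
example_output = '{"name": "calculate_retirement_savings", "arguments": {"current_age": 40, "retirement_age": 65, "current_savings": 0, "monthly_contribution": 500}}'
print(dispatch_function_call(example_output))  # -> 150000 (25 years * 12 months * $500, no interest)
```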


## Training Details

### Training Data

The model was fine-tuned on a synthetic dataset of function-calling prompts and responses. The data was curated to cover a wide range of potential function calls, ensuring the model's applicability to various structured text generation tasks.

The script to generate the data can be found in this [repository](https://xxxxxxxx).

### Training Procedure

- **Training regime:** The model was fine-tuned using 4-bit precision with `bnb_4bit` quantization on NVIDIA GPUs.
- **Optimizer:** PagedAdamW (32-bit)
- **Learning Rate:** 2e-4
- **Batch Size:** 2 (with gradient accumulation steps = 1)
- **Epochs:** 1
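
The full training script is not reproduced here, but a minimal sketch of a comparable setup is shown below, assuming a LoRA adapter configured through PEFT and TRL's `SFTTrainer`. Only the hyperparameters listed above come from this card; the adapter settings, dataset path, and exact `SFTTrainer` arguments (which vary across TRL versions) are illustrative assumptions.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer

# 4-bit quantization, matching the bnb_4bit regime described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

# LoRA adapter settings are illustrative; the card only states that PEFT was used.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

# Hyperparameters from the card: PagedAdamW 32-bit, lr 2e-4, batch size 2, 1 epoch.
training_args = TrainingArguments(
    output_dir="phi-2-function-calling",
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    num_train_epochs=1,
)

# "train.jsonl" is a placeholder for the synthetic function-calling dataset.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",
)
trainer.train()
```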

#### Preprocessing 

The training and evaluation data was generated using this [repository](https://xxxxxxxx).


## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a held-out test set comprising 10% of the original dataset and covering a variety of function-calling scenarios.
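
As a rough sketch of how such a 90/10 split could be produced with the `datasets` library (the dataset file name below is a placeholder, not the actual data path):

```python
from datasets import load_dataset

# Placeholder file name for the synthetic function-calling dataset.
dataset = load_dataset("json", data_files="function_calls.jsonl", split="train")

# Hold out 10% of the examples as the evaluation/test set.
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_set, test_set = splits["train"], splits["test"]
```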


#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary


## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

<!-- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). -->



Experiments were conducted using a private infrastructure, which has a carbon efficiency of 0.432 kgCO₂eq/kWh. A cumulative total of 10 hours of computation was performed on hardware of type GTX 1080 (TDP of 180 W).

Total emissions are estimated at 0.78 kgCO₂eq (180 W × 10 h = 1.8 kWh; 1.8 kWh × 0.432 kgCO₂eq/kWh ≈ 0.78 kgCO₂eq), of which 0% was directly offset.

Estimations were conducted using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

```BibTeX
@article{lacoste2019quantifying,
  title={Quantifying the Carbon Emissions of Machine Learning},
  author={Lacoste, Alexandre and Luccioni, Alexandra and Schmidt, Victor and Dandres, Thomas},
  journal={arXiv preprint arXiv:1910.09700},
  year={2019}
}
```
                        

- **Hardware Type:** NVIDIA GPUs (GTX 1080)
- **Hours used:** 10
- **Cloud Provider:** Private Infrastructure
- **Carbon Emitted:** 0.78 kgCO₂eq

## Technical Specifications

### Model Architecture and Objective

The model is based on the "microsoft/phi-2" architecture, fine-tuned specifically for function-calling tasks. The objective was to optimize the model's ability to generate structured text suitable for automated function execution.

### Compute Infrastructure

Training was run on a private infrastructure (see the Hardware and Software subsections below).

#### Hardware

The model was trained on an NVIDIA GTX 1080 GPU.

#### Software

The training used PyTorch and the Hugging Face Transformers library, with additional support from the TRL and PEFT libraries for supervised fine-tuning.

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**
```BibTeX
@misc{phi2functioncalling,
  title={phi-2-function-calling},
  author={Carlos Rodrigues},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/DataKensei/phi-2-function-calling}},
}
```

## Model Card Contact

For more information, please contact Carlos Rodrigues at DataKensei.