Update README.md
README.md
CHANGED
@@ -235,6 +235,26 @@ Average: 41.65
| | |mc2 |0.5911|± |0.0158|
```
# Inference Code
Here is example code using HuggingFace Transformers to run inference with the model (note: in 4-bit, it will require around 5GB of VRAM)
| | |mc2 |0.5911|± |0.0158|
```
+# Function Calling Evaluations
+
+We worked with Fireworks.AI on evaluations, starting from their Function Calling eval dataset, fixing some unsolvable ones, and generating a second eval dataset for JSON mode.
+
+## Function Calling Accuracy: 91%
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/XF3Zii4-QhE2yjWwHr_v4.png)
+
+## JSON Mode Accuracy: 84%
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/8H2iyjh5wyP2FtLq2LCed.png)
+
+Run the evaluator yourself using @interstellarninja's codebase here:
+https://github.com/interstellarninja/function-calling-eval
+
+You can find the evaluation datasets here:
+https://huggingface.co/datasets/NousResearch/func-calling-eval
+https://huggingface.co/datasets/NousResearch/json-mode-eval
+
+
# Inference Code
Here is example code using HuggingFace Transformers to run inference with the model (note: in 4-bit, it will require around 5GB of VRAM)