Create README.md
README.md
ADDED
@@ -0,0 +1,23 @@
+---
+license: llama3.2
+datasets:
+- teknium/OpenHermes-2.5
+- NousResearch/hermes-function-calling-v1
+base_model:
+- minpeter/QLoRA-Llama-3.2-1B-chatml-tool-v2
+- minpeter/Llama-3.2-1B-AlternateTokenizer-chatml
+language:
+- en
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- axolotl
+- merge
+---
+
+
+This model differs from Llama-3.2-1B-chatml-tool-v1 only in its tokenizer: it uses AlternateTokenizer, which does not define the tool-related tokens (`<tools>`, `<tool_call>`, `<tool_response>`).
+
+With the existing tool-AlternateTokenizer, the `<tool_call>` tag was not properly generated before the function call. In v2, trained with the general AlternateTokenizer, the model was observed to perform well.
+
+It still needs to be checked whether this behavior also appears in larger models (3B, 8B).
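
One way to check the `<tool_call>` behavior described above, for example when comparing against larger models, is to count how many generated outputs contain a well-formed tagged call. The sketch below is illustrative only and is not part of this repository: the function names `extract_tool_calls` and `tool_call_rate` are hypothetical, and it assumes the Hermes-style convention of a JSON object wrapped in `<tool_call>...</tool_call>`.

```python
import json
import re

# Assumption: a tool call is a JSON object wrapped in <tool_call>...</tool_call>.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return the parsed JSON payload of every well-formed <tool_call> block."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            # The tag is present but the payload is not valid JSON: skip it.
            continue
    return calls


def tool_call_rate(outputs):
    """Fraction of outputs that contain at least one well-formed tool call."""
    if not outputs:
        return 0.0
    hits = sum(1 for text in outputs if extract_tool_calls(text))
    return hits / len(outputs)
```

Running `tool_call_rate` over the same prompts for each tokenizer variant or model size would give a rough number to compare. A bare JSON call without the surrounding tag counts as a miss.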