Update README.md
library_name: transformers
pipeline_tag: text-generation
---

# Chikuma_10.7B - V2 (Enhanced with DPO)

<p align="center">
  <img src="https://huggingface.co/sethuiyer/distilabled_Chikuma_10.7B/resolve/main/chikuma_v2.webp" height="256px" alt="Chikuma">
</p>

This model is the **DPO fine-tuned version** of [Chikuma_10.7B](https://huggingface.co/sethuiyer/Chikuma_10.7B), which was a depth-upscaled merge of:

* [sethuiyer/SynthIQ-7b](https://huggingface.co/sethuiyer/SynthIQ-7b)
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)

The name "Chikuma" is inspired by the [Chikuma River](https://en.wikipedia.org/wiki/Shinano_River), the longest river in Japan, known for its continuous flow and meandering path. This metaphorically represents the model's depth, fluidity, and adaptability in processing and understanding language.

# Dataset Used for Fine-Tuning

Dataset: `argilla/distilabel-intel-orca-dpo-pairs`

The dataset contained roughly ~3,000 samples, but they were high quality (according to the `chosen_score`).

The following filters were applied to the original dataset:
```python
dataset = dataset.filter(
    ...
)
```
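The filter body is truncated in this view. For illustration only, a `chosen_score`-based predicate might look like the following sketch; the field names (`chosen_score`, `status`) and the threshold of 8 are assumptions, not the exact upstream settings:

```python
# Illustrative only: keep high-quality, non-tied preference pairs.
# Field names and the >= 8 threshold are assumptions for this sketch.
def keep_pair(example: dict) -> bool:
    """Return True for pairs worth training on."""
    return (
        example.get("chosen_score", 0) >= 8   # high-confidence preference
        and example.get("status") != "tie"    # discard ties between answers
    )

samples = [
    {"chosen_score": 10, "status": "ok"},
    {"chosen_score": 5, "status": "ok"},
    {"chosen_score": 9, "status": "tie"},
]
filtered = [s for s in samples if keep_pair(s)]
print(len(filtered))  # -> 1
```

With the `datasets` library, the same predicate would simply be passed to `dataset.filter(keep_pair)`.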

# Chat Template

The chat template for Chikuma_10.7B - V2 is a modified version of ChatML, optimized for improved interaction and engagement:
```
<|im_start|>GPT4 Correct system:
...
{assistant}<|im_end|>
```
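Because the middle of the template above is truncated, here is a small illustrative helper that assembles a prompt in this modified-ChatML style. Only the `GPT4 Correct system` role label and the `<|im_start|>` / `<|im_end|>` delimiters come from the snippet; the user and assistant role labels are assumptions:

```python
# Sketch of prompt assembly in the modified-ChatML style shown above.
# The user/assistant role labels are assumed, as the full template is truncated.
def build_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>GPT4 Correct system:\n{system}<|im_end|>\n"
        f"<|im_start|>GPT4 Correct user:\n{user}<|im_end|>\n"
        f"<|im_start|>GPT4 Correct assistant:\n"
    )

prompt = build_prompt("You are a helpful assistant.", "Name Japan's longest river.")
print(prompt)
```

Leaving the assistant turn open after the final role marker is what cues the model to generate its reply there.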

### Training Environment

- Hardware: Single A100 80GB GPU on RunPod, used for approximately 1.5 hours.
- Training Script: Available as a [Google Colab Notebook](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing). Special thanks to [mlabonne](https://huggingface.co/mlabonne) for providing the template.
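As background on what a DPO run optimizes (this is not the author's training code), the per-preference-pair DPO loss can be sketched numerically: it pushes the policy to raise the chosen answer's log-probability relative to the reference model more than the rejected answer's. The `beta` value and the example log-probabilities below are illustrative only:

```python
import math

# Background sketch of the per-pair DPO loss:
#   loss = -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio)))
# where each log-ratio compares policy vs. reference log-probabilities.
def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# With no preference shift the loss is exactly -log(0.5) = ln 2 ~ 0.693;
# it drops below that once the policy favors the chosen answer more
# strongly than the reference does.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))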

## Usage

...

## Acknowledgements

A heartfelt appreciation goes to the vibrant open-source community, particularly:

* The Intel team, for publishing a great open dataset and showing how well it worked in the first place.
* Teknium and NousResearch, for their awesome work and models.
* Maxime, for sharing such great resources.
* Argilla, for publishing `argilla/distilabel-intel-orca-dpo-pairs`.