---
base_model:
- beomi/Llama-3-Open-Ko-8B
- meta-llama/Meta-Llama-3-8B-Instruct
- meta-llama/Meta-Llama-3-8B
library_name: transformers
tags:
- mergekit
- merge
license: other
license_name: llama3
language:
- ko
---
# Llama-3-Ko-8B-dare-ties

This model is part of a series exploring 'Base + Language + Instruct' combinations: the chat vector approach plus various merge methods available in mergekit.
Thanks again, @beomi!

| Model | Merge Method | Score (benchmark TBD) |
|---|---|---|
| [beomi/Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview) | Chat vector (see the sketch after this table) | - |
| [kuotient/Llama-3-Ko-8B-ties](https://huggingface.co/kuotient/Llama-3-Ko-8B-ties) | TIES | - |
| [kuotient/Llama-3-Ko-8B-dare-ties](https://huggingface.co/kuotient/Llama-3-Ko-8B-dare-ties) | DARE-TIES | - |
| [kuotient/Llama-3-Ko-8B-TA](https://huggingface.co/kuotient/Llama-3-Ko-8B-TA) | Task Arithmetic (probably; not entirely sure about this) | - |
| WIP | Model Stock (haven't read the paper yet, but still) | - |
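
For context, the "chat vector" row above refers to adding the (Instruct βˆ’ Base) weight deltas onto the Korean continued-pretraining model. Below is a minimal sketch of that idea, assuming all three models share the same architecture and tokenizer; it is illustrative only (the output path is hypothetical, and loading three 8B models at once is memory-hungry), not the exact recipe used for the Instruct-preview model:

```python
# Minimal chat-vector sketch: ko += (instruct - base), tensor by tensor.
# Illustrative only; the output path is hypothetical.
import torch
from transformers import AutoModelForCausalLM

kwargs = dict(torch_dtype=torch.bfloat16, low_cpu_mem_usage=True)
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", **kwargs)
inst = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", **kwargs)
ko = AutoModelForCausalLM.from_pretrained("beomi/Llama-3-Open-Ko-8B", **kwargs)

base_sd, inst_sd = base.state_dict(), inst.state_dict()
with torch.no_grad():
    # state_dict() tensors are views into the model, so in-place
    # updates here modify `ko` directly.
    for name, tensor in ko.state_dict().items():
        tensor += inst_sd[name] - base_sd[name]  # chat vector = instruct - base

ko.save_pretrained("./Llama-3-Ko-8B-chat-vector")  # hypothetical path
```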

The original paper's authors claim density should be around 0.2~0.3, but in practice a higher value gave somewhat better results here. You should try other parameters to get a better result than this!
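
For intuition, "density" in DARE is the probability of keeping each delta parameter: dropped entries are zeroed and survivors are rescaled by 1/density so the expected update is preserved. A toy sketch of that mechanism (not mergekit's actual implementation):

```python
# Toy illustration of DARE's drop-and-rescale; not mergekit's implementation.
import torch

def dare_sparsify(delta: torch.Tensor, density: float) -> torch.Tensor:
    # Keep each element with probability `density`...
    mask = torch.bernoulli(torch.full_like(delta, density))
    # ...and rescale survivors so the expected delta is unchanged.
    return delta * mask / density

delta = torch.randn(8)
print(dare_sparsify(delta, density=0.53))  # 0.53 matches the config below
```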

```python
messages = [
    {"role": "system", "content": "μΉœμ ˆν•œ μ±—λ΄‡μœΌλ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ μžμ„Έν•˜κ³  μΉœμ ˆν•˜κ²Œ λ‹΅ν•˜μž. λͺ¨λ“  λŒ€λ‹΅μ€ ν•œκ΅­μ–΄(Korean)으둜 λŒ€λ‹΅ν•΄μ€˜."},
    {"role": "user", "content": "ν”Όλ³΄λ‚˜μΉ˜ μˆ˜μ—΄μ΄ 뭐야? 그리고 ν”Όλ³΄λ‚˜μΉ˜ μˆ˜μ—΄μ— λŒ€ν•΄ 파이썬 μ½”λ“œλ₯Ό 짜쀘볼래?"},
]
---
ν”Όλ³΄λ‚˜μΉ˜ μˆ˜μ—΄μ€ μˆ˜ν•™μ—μ„œ κ°€μž₯ 유λͺ…ν•œ μˆ˜μ—΄ 쀑 ν•˜λ‚˜μž…λ‹ˆλ‹€. ν”Όλ³΄λ‚˜μΉ˜ μˆ˜μ—΄μ€ λ‹€μŒκ³Ό 같은 κ·œμΉ™μœΌλ‘œ μƒμ„±λ©λ‹ˆλ‹€.

1. 첫 번째 μˆ˜λŠ” 0, 두 번째 μˆ˜λŠ” 1
2. μ΄ν›„μ˜ μˆ˜λŠ” 이전 두 수의 ν•©μœΌλ‘œ 계산

예λ₯Ό λ“€μ–΄, ν”Όλ³΄λ‚˜μΉ˜ μˆ˜μ—΄μ˜ 첫 10개 μˆ˜λŠ” λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55,...

이 μˆ˜μ—΄μ€ 맀우 λΉ λ₯΄κ²Œ 컀지기 λ•Œλ¬Έμ—, ν”Όλ³΄λ‚˜μΉ˜ μˆ˜μ—΄μ„ κ³„μ‚°ν•˜λŠ” λ°λŠ” 일반적으둜 μ§€μˆ˜ ν•¨μˆ˜λ₯Ό μ‚¬μš©ν•©λ‹ˆλ‹€.

파이썬 μ½”λ“œλ₯Ό μž‘μ„±ν•˜λ©΄ λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€.

(the inner code fence came out broken.. πŸ€—)
def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

for i in range(10):
    print(fibonacci(i))

이 μ½”λ“œλŠ” ν”Όλ³΄λ‚˜μΉ˜ μˆ˜μ—΄μ˜ 첫 10개 수λ₯Ό κ³„μ‚°ν•©λ‹ˆλ‹€. κ²°κ³ΌλŠ” λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€.

0
1
1
2
3
5
8
13
21
```
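
For reference, the transcripts above and below follow the usual transformers chat flow; roughly something like the sketch here, where the sampling settings are assumptions rather than the exact ones used:

```python
# Rough reproduction sketch; sampling parameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kuotient/Llama-3-Ko-8B-dare-ties"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Any of the message lists shown in this card can go here.
messages = [{"role": "user", "content": "μ•ˆλ…•ν•˜μ„Έμš”!"}]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    # Llama 3 chat models end turns with <|eot_id|>.
    eos_token_id=[tokenizer.eos_token_id,
                  tokenizer.convert_tokens_to_ids("<|eot_id|>")],
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```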

```python
messages = [
    {"role": "system", "content": "μΉœμ ˆν•œ μ±—λ΄‡μœΌλ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ μžμ„Έν•˜κ³  μΉœμ ˆν•˜κ²Œ λ‹΅ν•˜μž. λͺ¨λ“  λŒ€λ‹΅μ€ ν•œκ΅­μ–΄(Korean)으둜 λŒ€λ‹΅ν•΄μ€˜."},
    {"role": "user", "content": "ν•œκ΅­μ˜ μ œν—Œ ν—Œλ²•μ— λŒ€ν•΄μ„œ μžμ„Ένžˆ μ„€λͺ…ν•΄μ£Όμ„Έμš”."},
]
---
ν•œκ΅­μ˜ μ œν—Œ ν—Œλ²•μ€ 1948λ…„ 7μ›” 17일 μ œν—Œκ΅­νšŒμ—μ„œ μ˜κ²°ν•˜μ—¬ 1948λ…„ 7μ›” 27일 곡포된 λŒ€ν•œλ―Όκ΅­ ν—Œλ²•μ˜ μ΄ˆμ•ˆμž…λ‹ˆλ‹€. μ œν—Œ ν—Œλ²•μ€ λŒ€ν•œλ―Όκ΅­μ˜ ν—Œλ²•μ  κΈ°λ³Έμ§ˆμ„œλ₯Ό κ·œμ •ν•˜λŠ” λ¬Έμ„œλ‘œ, λŒ€ν•œλ―Όκ΅­μ˜ ꡭ호, μ˜ν† , κ΅­λ―Ό, κ΅­κ°€μ˜ ν˜•νƒœ, μ •λΆ€μ˜ 쑰직, ꡭ민의 κΆŒλ¦¬μ™€ 의무, 법λ₯ μ˜ μ œμ • 등에 λŒ€ν•΄ κ·œμ •ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€. μ œν—Œ ν—Œλ²•μ€ 1948λ…„ 7μ›” 17일 μ œν—Œκ΅­νšŒμ—μ„œ μ˜κ²°λ˜μ–΄ 1948λ…„ 7μ›” 27일 곡포된 이후 1952λ…„ 12μ›” 5일뢀터 1962λ…„ 12μ›” 5μΌκΉŒμ§€ μ μš©λ˜μ—ˆμŠ΅λ‹ˆλ‹€.

μ œν—Œ ν—Œλ²•μ˜ μ£Όμš” λ‚΄μš©μ€ λ‹€μŒκ³Ό κ°™μŠ΅λ‹ˆλ‹€.

1. ꡭ호: λŒ€ν•œλ―Όκ΅­
2. μ˜ν† : λŒ€ν•œλ―Όκ΅­μ˜ μ˜ν† λŠ” 38도선 이뢁과 μ΄λ‚¨μœΌλ‘œ λ‚˜λˆ„μ–΄μ Έ 있으며, 38도선 이뢁의 μ˜ν† λŠ” λŒ€ν•œλ―Όκ΅­μ˜ μ˜ν† λ‘œ μΈμ •ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€.
3. κ΅­λ―Ό: λŒ€ν•œλ―Όκ΅­μ˜ ꡭ민은 λͺ¨λ“  μ‚¬λžŒμœΌλ‘œ, ꡭ적을 가진 μ‚¬λžŒκ³Ό ꡭ적을 가진 μ‚¬λžŒμ˜ ν›„μ†μœΌλ‘œ κ΅¬μ„±λ©λ‹ˆλ‹€.
4. κ΅­κ°€μ˜ ν˜•νƒœ: λŒ€ν•œλ―Όκ΅­μ€ λ―Όμ£Όκ³΅ν™”κ΅­μœΌλ‘œ, λŒ€ν†΅λ Ήμ΄ κ΅­κ°€μ˜ μˆ˜λ°˜μœΌλ‘œμ„œ ν†΅μΉ˜ν•©λ‹ˆλ‹€.
5. μ •λΆ€μ˜ 쑰직: λŒ€ν†΅λ Ή, ꡭ무총리, κ΅­λ¬΄μœ„μ› λ“±μœΌλ‘œ κ΅¬μ„±λ˜λŠ” μ •λΆ€λ₯Ό 두고 μžˆμŠ΅λ‹ˆλ‹€.
6. ꡭ민의 κΆŒλ¦¬μ™€ 의무: λŒ€ν•œλ―Όκ΅­μ˜ ꡭ민은 ν‰λ“±ν•œ ꢌ리λ₯Ό 가지며, ꡭ적, 인ꢌ, μž¬μ‚°κΆŒ, ν‘œν˜„μ˜ 자유 λ“± 기본적 ꢌ리λ₯Ό 보μž₯λ°›μŠ΅λ‹ˆλ‹€.
7. 법λ₯ μ˜ μ œμ •: λŒ€ν•œλ―Όκ΅­μ˜ 법λ₯ μ€ κ΅­νšŒμ—μ„œ μ œμ •ν•˜κ³ , λŒ€ν†΅λ Ήμ΄ κ³΅ν¬ν•©λ‹ˆλ‹€.

이와 같은 μ œν—Œ ν—Œλ²•μ€ λŒ€ν•œλ―Όκ΅­μ˜ ν—Œλ²•μ  κΈ°λ³Έμ§ˆμ„œλ₯Ό κ·œμ •ν•˜λŠ” λ¬Έμ„œλ‘œ, λŒ€ν•œλ―Όκ΅­μ˜ ꡭ호, μ˜ν† , κ΅­λ―Ό, κ΅­κ°€μ˜ ν˜•νƒœ, μ •λΆ€μ˜ 쑰직, ꡭ민의 κΆŒλ¦¬μ™€ 의무, 법λ₯ μ˜ μ œμ • 등에 λŒ€ν•΄ κ·œμ •ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€.
```
Repetitive results, but I **really really** like this output, though.
```python
messages = [
    {"role": "system", "content": "μΉœμ ˆν•œ μ±—λ΄‡μœΌλ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ μžμ„Έν•˜κ³  μΉœμ ˆν•˜κ²Œ λ‹΅ν•˜μž. λͺ¨λ“  λŒ€λ‹΅μ€ ν•œκ΅­μ–΄(Korean)으둜 λŒ€λ‹΅ν•΄μ€˜."},
    {"role": "user", "content": "μ„Έμ’…λŒ€μ™•μ˜ μ§‘ν˜„μ „ λ§₯뢁 던짐 사건에 λŒ€ν•΄μ„œ μ•Œκ³  μžˆμ–΄?"},
]
---
λ„€, μ„Έμ’…λŒ€μ™•μ΄ μ§‘ν˜„μ „μ—μ„œ λ§₯뢁을 λ˜μ‘Œλ‹€λŠ” μ΄μ•ΌκΈ°λŠ” 사싀이 μ•„λ‹™λ‹ˆλ‹€. 이 μ΄μ•ΌκΈ°λŠ” μ„Έμ’…λŒ€μ™•μ΄ ν›ˆλ―Όμ •μŒ 창제 λ‹Ήμ‹œμ˜ μΌν™”λ‘œ, λ‹Ήμ‹œμ˜ ν•™μžλ“€μ΄ ν›ˆλ―Όμ •μŒμ˜ μ°½μ œμ— λ°˜λŒ€ν•˜λŠ” μ˜κ²¬μ„ λ‚΄μ„Έμ›Œ μ„Έμ’…λŒ€μ™•μ—κ²Œ ν•­μ˜λ₯Ό ν•˜μž, μ„Έμ’…λŒ€μ™•μ΄ 그듀을 μ§ˆμ±…ν•˜λ©΄μ„œ ν›ˆλ―Όμ •μŒ 창제λ₯Ό κ°•μ‘°ν•œ μΌν™”λ‘œ μ „ν•΄μ§‘λ‹ˆλ‹€. ν•˜μ§€λ§Œ 이 μΌν™”λŠ” μ‹€μ œλ‘œ μΌμ–΄λ‚œ 일은 μ•„λ‹ˆλ©°, ν›ˆλ―Όμ •μŒ μ°½μ œμ— λ°˜λŒ€ν•˜λŠ” ν•™μžλ“€μ€ μ‹€μ œλ‘œ μ—†μ—ˆμœΌλ©°, μ„Έμ’…λŒ€μ™•λ„ ν›ˆλ―Όμ •μŒ 창제λ₯Ό κ°•μ‘°ν•œ 적은 μ—†μŠ΅λ‹ˆλ‹€. 이 μ΄μ•ΌκΈ°λŠ” ν›„λŒ€μ— λ§Œλ“€μ–΄μ§„ κ²ƒμœΌλ‘œ μΆ”μ •λ©λ‹ˆλ‹€. 

μ„Έμ’…λŒ€μ™•μ€ ν›ˆλ―Όμ •μŒ μ°½μ œμ— λŒ€ν•΄ 맀우 μ—΄μ˜λ₯Ό λ‚΄λ³΄μ˜€κ³ , ν›ˆλ―Όμ •μŒμ΄ 창제되기 μ „μ—λŠ” ν•œκΈ€μ„ μ‚¬μš©ν•˜μ§€ μ•Šμ•˜μŠ΅λ‹ˆλ‹€. ν›ˆλ―Όμ •μŒμ΄ 창제된 ν›„μ—λŠ” ν›ˆλ―Όμ •μŒμ— κΈ°λ°˜ν•œ ν•œκΈ€μ„ μ‚¬μš©ν–ˆμŠ΅λ‹ˆλ‹€. ν›ˆλ―Όμ •μŒμ€ 1443λ…„λΆ€ν„° 1446λ…„κΉŒμ§€ 4λ…„ λ™μ•ˆμ— 걸쳐 μ°½μ œλ˜μ—ˆμŠ΅λ‹ˆλ‹€. 이 μ‹œκΈ°μ— μ„Έμ’…λŒ€μ™•μ€ ν›ˆλ―Όμ •μŒμ˜ μ°½μ œμ— 맀우 관심을 가지고 μžˆμ—ˆμœΌλ©°, ν›ˆλ―Όμ •μŒμ˜ 창제λ₯Ό μœ„ν•΄ ν•™μžλ“€μ„ 뢈러 λͺ¨μ•„ ν›ˆλ―Όμ •μŒμ˜ μ°½μ œμ— μ°Έμ—¬ν•˜λ„λ‘ ν–ˆμŠ΅λ‹ˆλ‹€. ν›ˆλ―Όμ •μŒμ΄ 창제된 ν›„μ—λŠ” ν›ˆλ―Όμ •μŒμ— κΈ°λ°˜ν•œ ν•œκΈ€μ„ μ‚¬μš©ν•˜κ³ , ν•œκΈ€μ„ μ‚¬μš©ν•˜λŠ” 것을 μž₯λ €ν–ˆμŠ΅λ‹ˆλ‹€. 

μ„Έμ’…λŒ€μ™•μ€ ν›ˆλ―Όμ •μŒμ˜ 창제λ₯Ό 톡해 ν•œκΈ€μ„ μ‚¬μš©ν•˜κ²Œ 되면, λͺ¨λ“  μ‚¬λžŒλ“€μ΄ μ‰½κ²Œ 읽고 μ“°λŠ” 글을 μ‚¬μš©ν•  수 있게 되고, μ΄λŠ” λ‚˜λΌμ˜ λ°œμ „μ„ μœ„ν•΄ 맀우 μ€‘μš”ν•œ 일이라고 μƒκ°ν–ˆμŠ΅λ‹ˆλ‹€. λ˜ν•œ, ν›ˆλ―Όμ •μŒμ˜ 창제λ₯Ό 톡해 ν•œκΈ€μ„ μ‚¬μš©ν•˜κ²Œ 되면, λ‚˜λΌμ˜ 톡일을 μœ„ν•΄ 맀우 μ€‘μš”ν•œ 일이라고 μƒκ°ν–ˆμŠ΅λ‹ˆλ‹€. 이에 따라 μ„Έμ’…λŒ€μ™•μ€ ν›ˆλ―Όμ •μŒμ˜ μ°½μ œμ— λ§Žμ€ λ…Έλ ₯을 κΈ°μšΈμ˜€μŠ΅λ‹ˆλ‹€. 

μ„Έμ’…λŒ€μ™•μ΄ ν›ˆλ―Όμ •μŒμ˜ μ°½μ œμ— 관심을 κ°€μ‘Œλ‹€λŠ” 것은 μ—­μ‚¬μ μœΌλ‘œ 확인할 수 μžˆμŠ΅λ‹ˆλ‹€. ν•˜μ§€λ§Œ μ„Έμ’…λŒ€μ™•
```

```python
messages = [
    {"role": "system", "content": "μΉœμ ˆν•œ μ±—λ΄‡μœΌλ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ μžμ„Έν•˜κ³  μΉœμ ˆν•˜κ²Œ λ‹΅ν•˜μž. λͺ¨λ“  λŒ€λ‹΅μ€ ν•œκ΅­μ–΄(Korean)으둜 λŒ€λ‹΅ν•΄μ€˜."},
    {"role": "user", "content": "λ‚˜λŠ” 였늘 사과 5개λ₯Ό κ°–κ³ μžˆμ–΄. 그리고 μ–΄μ œ λ‚˜λŠ” 사과 2개λ₯Ό λ¨Ήμ—ˆμ§€. λ‚΄κ°€ μ§€κΈˆ 가진 μ‚¬κ³ΌλŠ” λͺ‡ κ°œμ§€?"},
]
---
λ„€, 였늘 사과 5개 가지고 있고 μ–΄μ œ 2개 λ¨Ήμ—ˆμœΌλ‹ˆ, μ§€κΈˆ 가지고 μžˆλŠ” μ‚¬κ³ΌλŠ” 5κ°œμ—μ„œ 2개λ₯Ό λΊ€ 3κ°œμž…λ‹ˆλ‹€! πŸŽπŸ‘

(이 κ²½μš°μ—λŠ” 5κ°œμ—μ„œ 2개λ₯Ό λΊ€ 결과인 3κ°œκ°€ μ •λ‹΅μž…λ‹ˆλ‹€.)
```

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) as a base.

### Models Merged

The following models were included in the merge:
* [beomi/Llama-3-Open-Ko-8B](https://huggingface.co/beomi/Llama-3-Open-Ko-8B)
* [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: meta-llama/Meta-Llama-3-8B
    # no parameters necessary for base model
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.53
      weight: 0.5
  - model: beomi/Llama-3-Open-Ko-8B
    parameters:
      density: 0.53
      weight: 0.5
merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B
dtype: bfloat16
```
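
To reproduce the merge, this config can be run with mergekit's CLI (`mergekit-yaml config.yaml ./output-dir`) or through its Python API, sketched below with hypothetical paths:

```python
# Sketch of running the merge via mergekit's Python API
# (mirrors `mergekit-yaml config.yaml ./merged`); paths are hypothetical.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Llama-3-Ko-8B-dare-ties",  # hypothetical output path
    options=MergeOptions(cuda=True, copy_tokenizer=True),
)
```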