Update README.md
README.md
CHANGED
@@ -1,20 +1,48 @@
 ---
-
-tags:
-- mergekit
-- merge
-
+license: apache-2.0
 ---
-
+
+
+<style>
+img{
+user-select: none;
+transition: all 0.2s ease;
+border-radius: .5rem;
+}
+img:hover{
+transform: rotate(2deg);
+filter: invert(100%);
+}
+@import url('https://fonts.googleapis.com/css2?family=Vollkorn:ital,wght@0,400..900;1,400..900&display=swap');
+</style>
+
+<div style="background-color: transparent; border-radius: .5rem; padding: 2rem; font-family: monospace; font-size: .85rem; text-align: justify;">
+
+![cubby](https://huggingface.co/appvoid/cubby/resolve/main/cubby.webp)
 
 This is a passthrough merge of arco with an experimental model. As the benchmarks show, it dramatically improved on ARC Challenge, falling only 1.2 points short of modern 3B baseline performance.
 
 If you prefer multilingual, general knowledge, choose qwen. If you prefer solving simple English tasks, choose arco.
 
+#### prompt
+
+no prompt is set; this is intentional.
+
+
+#### benchmarks
+
 | Parameters | Model | MMLU | ARC-C | HellaSwag | PIQA | Winogrande | Average |
 | -----------|--------------------------------|-------|-------|-----------|--------|------------|---------|
 | 0.5b | qwen2 |44.13| 28.92| 49.05 | 69.31 | 56.99 | 49.68 |
 | 0.5b | arco (original) |24.41 | 38.23 | 59.21 | 74.27 | 59.59 | 51.14 |
 | 0.5b | qwen2.5 |**47.29**|31.83|52.17|70.29|57.06|51.72|
 | 0.5b | arco |26.17|37.29|62.88|74.37|**62.27**|52.60|
-| 0.5b | arco 2 |25.51|**38.82**|**63.02**|**74.70**|61.25|**52.66**|
+| 0.5b | arco 2 |25.51|**38.82**|**63.02**|**74.70**|61.25|**52.66**|
+#### supporters
+
+<a href="https://ko-fi.com/appvoid" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 34px !important; margin-top: -4px;width: 128px !important; filter: contrast(2) grayscale(100%) brightness(100%);" ></a>
+
+### trivia
+
+arco also means "arc optimized", hence the focus on this cognition benchmark.
+</div>
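The Average column in the benchmarks table appears to be the plain arithmetic mean of the five benchmark scores. The README does not state this, so treat it as an assumption. A minimal sketch that recomputes the averages from the values copied out of the table (the names `SCORES`, `REPORTED_AVERAGE`, and `recompute_averages` are illustrative and not part of the repository):

```python
# Scores copied from the benchmarks table, in column order:
# MMLU, ARC-C, HellaSwag, PIQA, Winogrande.
SCORES = {
    "qwen2": [44.13, 28.92, 49.05, 69.31, 56.99],
    "arco (original)": [24.41, 38.23, 59.21, 74.27, 59.59],
    "qwen2.5": [47.29, 31.83, 52.17, 70.29, 57.06],
    "arco": [26.17, 37.29, 62.88, 74.37, 62.27],
    "arco 2": [25.51, 38.82, 63.02, 74.70, 61.25],
}

# Average column exactly as reported in the table.
REPORTED_AVERAGE = {
    "qwen2": 49.68,
    "arco (original)": 51.14,
    "qwen2.5": 51.72,
    "arco": 52.60,
    "arco 2": 52.66,
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def recompute_averages(scores):
    """Assumption: Average is the unweighted mean of the five benchmarks."""
    return {model: mean(values) for model, values in scores.items()}


if __name__ == "__main__":
    for model, avg in recompute_averages(SCORES).items():
        reported = REPORTED_AVERAGE[model]
        print(f"{model:16s} recomputed={avg:.3f} "
              f"reported={reported:.2f} diff={avg - reported:+.3f}")
```

Recomputing from the rounded table values agrees with the reported column to within about 0.01 on every row. The largest gap is the qwen2.5 row, which may reflect truncation or unrounded inputs on the author's side rather than an error.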