# Usage

- Metharme format (the Mistral format should also work, but is untested)

# Upscaled Tuning Experiment Write Up Thingy

## What is the 39B Upscale?

https://huggingface.co/TheSkullery/BA-Zephyria-39b

```yaml
merge_method: passthrough
slices:
  - sources:
      - layer_range: [0, 41]
        model: unsloth/Mistral-Small-Instruct-2409
  - sources:
      - layer_range: [19, 41]
        model: unsloth/Mistral-Small-Instruct-2409
        parameters:
          scale:
            - filter: o_proj
              value: 0.0
            - filter: down_proj
              value: 0.0
            - value: 1.0
  - sources:
      - layer_range: [19, 41]
        model: unsloth/Mistral-Small-Instruct-2409
        parameters:
          scale:
            - filter: o_proj
              value: 0.0
            - filter: down_proj
              value: 0.0
            - value: 1.0
  - sources:
      - layer_range: [41, 55]
        model: unsloth/Mistral-Small-Instruct-2409
```
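
To see how many layers each slice in the config above contributes, here is a minimal sketch in Python. It assumes `layer_range` is half-open (`[start, end)`), which is how mergekit ranges are commonly read but is not stated in this card. `SLICES` and `slice_sizes` are names made up for this illustration.

```python
# Slices copied from the config above as (start, end) pairs.
# Assumption: layer_range is half-open, i.e. [start, end).
SLICES = [
    (0, 41),   # original layers, unscaled
    (19, 41),  # first duplicate, o_proj/down_proj scaled to 0.0
    (19, 41),  # second duplicate, o_proj/down_proj scaled to 0.0
    (41, 55),  # remaining original layers, unscaled
]


def slice_sizes(slices):
    """Return the number of layers each slice contributes."""
    return [end - start for start, end in slices]


sizes = slice_sizes(SLICES)
total_layers = sum(sizes)
duplicated_layers = sizes[1] + sizes[2]

print(sizes)              # [41, 22, 22, 14]
print(total_layers)       # 99
print(duplicated_layers)  # 44
```

Under that assumption the merged stack has 99 layers, 44 of which are nulled duplicates. If the ranges were inclusive instead, each slice would be one layer longer.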

- Layers 0 to 18 are original
- Layers 19 to 41 are duplicated, zeroed out, and put in the middle twice
- Layers 42 to 54 are original
- The **down_proj** and **o_proj** weights in the duplicated layers have been nulled, so the added layers need healing (further training) before the model will 'unignore' them

```
[   Unique   ][  Duplicated  ][   Unique   ]
0 -------- 18 19 --------- 41 42 -------- 54
    34.5%          41.8%          23.7%
```
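
Why nulling **o_proj** and **down_proj** makes a layer "ignored": in a residual architecture, each sub-block adds the result of its output projection back onto its input, so a zeroed output projection adds nothing. The toy sketch below shows this in pure Python. `toy_block`, `matvec`, and the tiny dimensions are hypothetical, and the sketch leaves out normalization and real attention.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of random weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def scaled(matrix, factor):
    """Scale every weight, like `scale: value` in the merge config."""
    return [[factor * w for w in row] for row in matrix]


def toy_block(x, inner_proj, out_proj):
    """Toy residual sub-block: x + out_proj(relu(inner_proj(x))).

    out_proj stands in for o_proj (attention) or down_proj (MLP).
    """
    hidden = [max(0.0, h) for h in matvec(inner_proj, x)]
    return [xi + yi for xi, yi in zip(x, matvec(out_proj, hidden))]


rng = random.Random(0)
dim, hidden_dim = 4, 8
x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
inner = random_matrix(hidden_dim, dim, rng)
out = random_matrix(dim, hidden_dim, rng)

healthy = toy_block(x, inner, out)               # normal output projection
nulled = toy_block(x, inner, scaled(out, 0.0))   # output projection scaled to 0.0

print(nulled == x)  # True: the nulled block passes its input through unchanged
```

The inner weights of a nulled layer are still there; they have no effect until training moves the output projections away from zero.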

# Weight Difference Visualization

- Nemo x Rocinante
- Small x Cydonia
- 39B Upscale x Tunguska 1 Epoch
- 39B Upscale x Tunguska 2 Epochs
- Tunguska 1 Epoch x Tunguska 2 Epochs
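
This card does not say which metric the plots use. As a rough illustration of what a per-tensor weight comparison for the pairs listed above can look like, here is a minimal sketch that assumes mean absolute difference. The function names, tensor names, and toy values are all hypothetical.

```python
def mean_abs_difference(base, tuned):
    """Mean absolute difference between two flattened tensors of equal length."""
    return sum(abs(t - b) for b, t in zip(base, tuned)) / len(base)


def compare_state_dicts(base_state, tuned_state):
    """Map each tensor name present in both models to its mean absolute difference."""
    return {
        name: mean_abs_difference(base_state[name], tuned_state[name])
        for name in base_state
        if name in tuned_state
    }


# Hypothetical toy "state dicts": tensor name -> flattened weights.
base_state = {
    "layers.0.self_attn.o_proj": [0.5, -0.25, 1.0, 0.0],
    "layers.1.self_attn.o_proj": [0.0, 0.0, 0.0, 0.0],  # nulled duplicate
}
tuned_state = {
    "layers.0.self_attn.o_proj": [0.5, -0.25, 1.0, 0.0],   # untouched by tuning
    "layers.1.self_attn.o_proj": [0.5, -0.5, 0.25, 0.75],  # healed during tuning
}

differences = compare_state_dicts(base_state, tuned_state)
for name, value in differences.items():
    print(name, value)
# layers.0.self_attn.o_proj 0.0
# layers.1.self_attn.o_proj 0.5
```

An absolute metric is used here because a relative one would divide by zero on nulled tensors.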

## Control Sample A (Nemo & Rocinante, similar training)

*Note the layer sequence and other labels here, since they will be unreadable in the 39B plots.*

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/EZN8Ci2_vAGmdq0WUyrpN.png)

## Control Sample B (Small & Cydonia, similar training)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/xdH_7fy9HuhSzaSE2-h4X.png)

## Tunguska 39B 1 Epoch vs. its base

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/X3-bHyQg03-QvZFvOhGp7.png)

## Tunguska 39B 2 Epochs vs. its base

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/-dRSeXmPXdE3_g67iKT0K.png)

## Tunguska 39B 1 Epoch vs. 2 Epochs

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/cjKf37TrSJHmq0S0_PZyE.png)