# Usage
- Metharme format (the Mistral format should work too, but is untested)

# Upscaled Tuning Experiment Write Up Thingy

## What is the 39B Upscale?

https://huggingface.co/TheSkullery/BA-Zephyria-39b
```yaml
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 41]
    model: unsloth/Mistral-Small-Instruct-2409
- sources:
  - layer_range: [19, 41]
    model: unsloth/Mistral-Small-Instruct-2409
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [19, 41]
    model: unsloth/Mistral-Small-Instruct-2409
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [41, 55]
    model: unsloth/Mistral-Small-Instruct-2409
```
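
To see how many layers the config above stacks up, here is a minimal sketch that tallies the slices. It assumes `layer_range` is half-open, `[start, end)`, like Python's `range()`. The `SLICES` list and the `tally` helper are hypothetical names written for this illustration and are not part of mergekit.

```python
# Slices copied from the merge config: (start, end, zeroed?).
# Assumption: layer_range is half-open, i.e. [start, end).
SLICES = [
    (0, 41, False),   # original layers
    (19, 41, True),   # duplicate with o_proj / down_proj scaled to 0
    (19, 41, True),   # second duplicate, same scaling
    (41, 55, False),  # original layers
]


def tally(slices):
    """Return (total_layers, zeroed_layers, layout) for a passthrough stack."""
    layout = []  # one (source_layer, zeroed) entry per output layer
    for start, end, zeroed in slices:
        layout.extend((src, zeroed) for src in range(start, end))
    total = len(layout)
    zeroed_count = sum(1 for _, is_zeroed in layout if is_zeroed)
    return total, zeroed_count, layout


total, zeroed_count, layout = tally(SLICES)
print(f"output layers: {total}")
print(f"zeroed duplicates: {zeroed_count} ({zeroed_count / total:.1%})")
```

Under the half-open assumption this gives 99 output layers, 44 of which are zeroed duplicates. With half-open ranges the duplicated block covers source layers 19 to 40, so the boundaries can differ by one from an inclusive reading.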

- Layers 0 to 18 are original
- Layers 19 to 41 are duplicated, zeroed out, and inserted in the middle twice
- Layers 42 to 54 are original
- The **down_proj** and **o_proj** weights of the duplicated layers have been zeroed, so the added layers are ignored until healing (further training) teaches the model to use them

```
[    Unique    ][    Duplicated    ][    Unique    ]
0 ----------- 18 19 ------------ 41 42 ---------- 54
     34.5%           41.8%            23.7%
```
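
Why zeroing **o_proj** and **down_proj** makes the added layers get ignored: in a residual transformer block, each sub-block adds its output to the stream through one of those two projections, so a zero projection adds nothing. The toy sketch below shows this with plain Python lists. It is a simplified stand-in, not the real Mistral layer: there are no norms and no real attention, no bias terms are assumed, and every name in it is hypothetical.

```python
def matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def relu(vec):
    return [max(0.0, x) for x in vec]


def toy_layer(x, attn_mix, o_proj, up_proj, down_proj):
    # Attention sub-block: residual + o_proj(stand-in attention output)
    h = add(x, matvec(o_proj, matvec(attn_mix, x)))
    # MLP sub-block: residual + down_proj(activation(up_proj(h)))
    return add(h, matvec(down_proj, relu(matvec(up_proj, h))))


DIM = 3
ZEROS = [[0.0] * DIM for _ in range(DIM)]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]
MIX = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.5], [-0.6, 0.9, 0.1]]

x = [1.0, -2.0, 0.5]
zeroed_out = toy_layer(x, MIX, ZEROS, MIX, ZEROS)        # both projections zeroed
normal_out = toy_layer(x, MIX, IDENTITY, MIX, IDENTITY)  # projections left active

print("zeroed layer passes input through:", zeroed_out == x)
print("active layer passes input through:", normal_out == x)
```

The zeroed layer returns its input unchanged, so the upscaled model should start out behaving like the original. The other weights in the duplicated layers are still there for training to build on.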

# Weight Difference Visualization
- Nemo x Rocinante
- Small x Cydonia
- 39B Upscale x Tunguska 1 Epoch
- 39B Upscale x Tunguska 2 Epochs
- Tunguska 1 Epoch x Tunguska 2 Epochs
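
This write-up does not say how the plots were produced. As a rough sketch of the kind of number such a plot could be built from, the snippet below computes a per-tensor difference between two checkpoints. It assumes each checkpoint is a plain dict of tensor name to a flat list of floats. The `tensor_diffs` helper and the toy tensor names are hypothetical.

```python
import math


def tensor_diffs(base, tuned):
    """Map tensor name -> (absolute L2 change, change relative to base norm).

    The relative value is None when the base tensor is all zeros, which is
    the case for zeroed projections in an upscaled base.
    """
    result = {}
    for name, base_vals in base.items():
        tuned_vals = tuned[name]
        diff = math.sqrt(sum((t - b) ** 2 for t, b in zip(tuned_vals, base_vals)))
        norm = math.sqrt(sum(b * b for b in base_vals))
        result[name] = (diff, diff / norm if norm > 0 else None)
    return result


# Toy checkpoints: one untouched tensor, one that started zeroed and was trained.
base = {
    "layers.0.mlp.down_proj": [3.0, 4.0],
    "layers.41.mlp.down_proj": [0.0, 0.0],
}
tuned = {
    "layers.0.mlp.down_proj": [3.0, 4.0],
    "layers.41.mlp.down_proj": [0.6, 0.8],
}

diffs = tensor_diffs(base, tuned)
for name, (absolute, relative) in diffs.items():
    print(name, absolute, relative)
```

For tensors that start at zero, only the absolute change is meaningful, which is worth keeping in mind when comparing the 39B plots against the control samples.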

## Control Sample A (Nemo & Rocinante, similar training)
*Note the layer sequence and other labels here, since they will be unreadable in the 39B plots*
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/EZN8Ci2_vAGmdq0WUyrpN.png)

## Control Sample B (Small & Cydonia, similar training)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/xdH_7fy9HuhSzaSE2-h4X.png)

## Tunguska 39B 1 Epoch vs. its base
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/X3-bHyQg03-QvZFvOhGp7.png)

## Tunguska 39B 2 Epochs vs. its base
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/-dRSeXmPXdE3_g67iKT0K.png)

## Tunguska 39B 1 Epoch vs. 2 Epochs
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/cjKf37TrSJHmq0S0_PZyE.png)