pranavajay committed on
Commit b02e39c · verified · 1 Parent(s): d7c83e8

Upload log.txt with huggingface_hub

Files changed (1)
  1. log.txt +0 -488
log.txt CHANGED
@@ -1,488 +0,0 @@
- Tensor 'context_embedder.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'context_embedder.weight' has different shapes: Model 1: torch.Size([2150, 4096]), Model 2: torch.Size([3072, 4096])
- Tensor 'time_text_embed.guidance_embedder.linear_1.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'time_text_embed.guidance_embedder.linear_1.weight' has different shapes: Model 1: torch.Size([2150, 256]), Model 2: torch.Size([3072, 256])
- Tensor 'time_text_embed.guidance_embedder.linear_2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'time_text_embed.guidance_embedder.linear_2.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'time_text_embed.text_embedder.linear_1.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'time_text_embed.text_embedder.linear_1.weight' has different shapes: Model 1: torch.Size([2150, 768]), Model 2: torch.Size([3072, 768])
- Tensor 'time_text_embed.text_embedder.linear_2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'time_text_embed.text_embedder.linear_2.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'time_text_embed.timestep_embedder.linear_1.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'time_text_embed.timestep_embedder.linear_1.weight' has different shapes: Model 1: torch.Size([2150, 256]), Model 2: torch.Size([3072, 256])
- Tensor 'time_text_embed.timestep_embedder.linear_2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'time_text_embed.timestep_embedder.linear_2.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.0.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.0.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.0.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.0.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.0.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.0.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.0.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.0.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.0.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.0.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.0.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.0.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.0.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.0.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.0.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.1.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.1.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.1.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.1.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.1.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.1.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.1.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.1.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.1.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.1.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.1.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.1.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.1.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.1.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.1.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.1.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.1.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.1.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.1.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.1.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.1.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.1.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.1.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.10.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.10.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.10.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.10.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.10.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.10.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.10.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.10.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.10.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.10.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.10.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.10.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.10.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.10.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.10.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.10.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.10.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.10.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.10.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.10.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.10.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.10.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.10.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.11.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.11.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.11.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.11.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.11.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.11.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.11.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.11.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.11.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.11.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.11.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.11.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.11.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.11.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.11.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.11.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.11.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.11.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.11.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.11.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.11.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.11.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.11.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.12.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.12.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.12.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.12.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.12.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.12.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.12.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.12.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.12.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.12.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.12.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.12.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.12.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.12.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.12.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.12.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.12.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.12.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.12.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.12.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.12.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.12.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.12.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.13.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.13.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.13.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.13.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.13.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.13.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.13.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.13.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.13.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.13.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.13.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.13.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.13.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.13.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.13.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.13.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.13.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.13.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.13.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.13.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.13.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.13.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.13.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.14.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.14.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.14.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.14.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.14.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.14.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.14.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.14.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.14.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.14.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.14.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.14.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.14.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.14.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.14.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.14.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.14.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.14.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.14.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.14.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.14.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.14.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.14.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.14.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.2.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.2.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.2.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.2.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.2.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.2.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.2.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.2.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.2.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.2.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.2.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.2.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.2.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.2.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.2.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.2.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.2.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.2.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.2.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.2.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.2.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.2.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.2.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.3.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.3.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.3.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.3.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.3.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.3.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.3.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.3.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.3.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.3.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.3.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.3.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.3.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.3.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.3.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.3.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.3.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.3.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.3.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.3.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.3.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.3.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.3.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.4.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.4.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.4.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.4.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.4.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.4.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.4.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.4.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.4.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.4.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.4.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.4.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.4.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.4.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.4.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.4.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.4.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.4.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.4.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.4.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.4.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.4.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.4.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.5.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.5.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.5.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.5.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.5.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.5.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.5.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.5.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.5.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.5.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.5.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.5.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.5.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.5.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.5.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.5.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.5.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.5.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.5.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.5.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.5.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.5.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.5.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.6.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.6.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.6.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.6.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.6.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.6.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.6.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.6.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.6.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.6.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.6.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.6.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.6.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.6.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.6.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.6.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.6.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.6.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.6.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.6.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.6.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.6.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.6.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.7.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.7.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.7.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.7.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.7.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.7.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.7.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.7.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.7.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.7.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.7.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.7.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.7.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.7.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.7.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.7.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.7.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.7.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.7.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.7.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.7.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.7.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.7.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.8.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.8.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.8.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.8.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.8.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.8.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.8.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.8.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.8.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.8.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.8.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.8.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.8.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.8.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.8.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.8.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.8.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.8.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.8.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.8.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.8.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.8.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.8.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.9.attn.add_k_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.attn.add_k_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.9.attn.add_q_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.attn.add_q_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.9.attn.add_v_proj.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.attn.add_v_proj.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.9.attn.norm_added_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.9.attn.norm_added_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.9.attn.norm_k.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.9.attn.norm_q.weight' has different shapes: Model 1: torch.Size([89]), Model 2: torch.Size([128])
- Tensor 'transformer_blocks.9.attn.to_add_out.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.attn.to_add_out.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.9.attn.to_k.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.attn.to_k.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.9.attn.to_out.0.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.attn.to_out.0.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.9.attn.to_q.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.attn.to_q.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.9.attn.to_v.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.attn.to_v.weight' has different shapes: Model 1: torch.Size([2150, 3072]), Model 2: torch.Size([3072, 3072])
- Tensor 'transformer_blocks.9.ff.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.9.ff.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.9.ff.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.ff.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.9.ff_context.net.0.proj.bias' has different shapes: Model 1: torch.Size([8601]), Model 2: torch.Size([12288])
- Tensor 'transformer_blocks.9.ff_context.net.0.proj.weight' has different shapes: Model 1: torch.Size([8601, 3072]), Model 2: torch.Size([12288, 3072])
- Tensor 'transformer_blocks.9.ff_context.net.2.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'transformer_blocks.9.ff_context.net.2.weight' has different shapes: Model 1: torch.Size([2150, 12288]), Model 2: torch.Size([3072, 12288])
- Tensor 'transformer_blocks.9.norm1.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.9.norm1.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'transformer_blocks.9.norm1_context.linear.bias' has different shapes: Model 1: torch.Size([12902]), Model 2: torch.Size([18432])
- Tensor 'transformer_blocks.9.norm1_context.linear.weight' has different shapes: Model 1: torch.Size([12902, 3072]), Model 2: torch.Size([18432, 3072])
- Tensor 'x_embedder.bias' has different shapes: Model 1: torch.Size([2150]), Model 2: torch.Size([3072])
- Tensor 'x_embedder.weight' has different shapes: Model 1: torch.Size([2150, 64]), Model 2: torch.Size([3072, 64])
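
The log does not include the script that produced it, so the following is only a minimal sketch of how shape-mismatch lines in this format could be generated. It assumes each model is available as a mapping from tensor name to shape; plain tuples stand in for `torch.Size` here so the example runs without PyTorch, and the function names plus the `example.unchanged` entry are hypothetical. It also checks an arithmetic pattern visible in the lines above: each Model 1 leading dimension equals the Model 2 dimension multiplied by 0.7 and rounded down (3072 to 2150, 12288 to 8601, 18432 to 12902, 128 to 89). That is an observation about the logged numbers, not a statement about how Model 1 was created.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def format_size(shape: Shape) -> str:
    """Render a shape the way torch.Size prints, e.g. torch.Size([2150, 64])."""
    return f"torch.Size({list(shape)})"


def diff_shapes(model_1: Dict[str, Shape], model_2: Dict[str, Shape]) -> List[str]:
    """Return one log line per tensor name present in both models with differing shapes."""
    lines = []
    for name in sorted(model_1.keys() & model_2.keys()):
        shape_1, shape_2 = tuple(model_1[name]), tuple(model_2[name])
        if shape_1 != shape_2:
            lines.append(
                f"Tensor '{name}' has different shapes: "
                f"Model 1: {format_size(shape_1)}, Model 2: {format_size(shape_2)}"
            )
    return lines


def leading_dim_is_70_percent(small: Shape, full: Shape) -> bool:
    """True if small[0] == floor(0.7 * full[0]), using integer arithmetic only."""
    return small[0] == (full[0] * 7) // 10


# Shapes copied from the log above, except 'example.unchanged', which is a
# hypothetical entry added to show that matching shapes produce no output.
MODEL_1 = {
    "transformer_blocks.9.attn.norm_q.weight": (89,),
    "x_embedder.bias": (2150,),
    "x_embedder.weight": (2150, 64),
    "example.unchanged": (64,),
}
MODEL_2 = {
    "transformer_blocks.9.attn.norm_q.weight": (128,),
    "x_embedder.bias": (3072,),
    "x_embedder.weight": (3072, 64),
    "example.unchanged": (64,),
}

if __name__ == "__main__":
    for line in diff_shapes(MODEL_1, MODEL_2):
        print(line)
```

With real checkpoints, the two mappings would presumably be built from each model's state dict (name to `tensor.shape`); that loading step is left out because the log does not show how the models were read.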