Update README.md
README.md
CHANGED
@@ -225,7 +225,7 @@ parameters:
 - value: 0.5
 dtype: bfloat16 #Oops accidentally swtich to half precision do this also very important
 
-#10. Heal the layers, o_proj and down_proj seems to be the
+#10. Heal the layers, o_proj and down_proj seems to be the main tensors that determine adaptation to a new architecture, so we can steal them from an already finetuned 15B,
 #this way we don't need to finetune our new frankenmerge at all to have full performance. Why reinvent the wheel?
 #sapphire
 models:
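The "heal the layers" comment in step 10 can be illustrated outside the merge config. The following is only a rough sketch of the idea, not the author's actual procedure: it assumes tensors are addressed by Llama-style names such as `model.layers.0.self_attn.o_proj.weight`, uses plain dicts with placeholder strings in place of real checkpoints, and the names `heal_layers`, `HEAL_KEYS`, `merged`, and `donor` are hypothetical.

```python
# Hypothetical sketch: copy o_proj and down_proj tensors from an already
# finetuned donor model into a freshly merged model, leaving the rest alone.
# Tensor names and values below are placeholders, not real checkpoint data.

HEAL_KEYS = ("o_proj", "down_proj")  # substrings named in the README comment


def heal_layers(merged, donor, keys=HEAL_KEYS):
    """Return a copy of `merged` with matching tensors taken from `donor`.

    A tensor is replaced only if its name exists in both models and
    contains one of the substrings in `keys`.
    """
    healed = dict(merged)  # shallow copy so the input mapping is not mutated
    for name, tensor in donor.items():
        if name in healed and any(key in name for key in keys):
            healed[name] = tensor
    return healed


merged = {
    "model.layers.0.self_attn.q_proj.weight": "merged-q",
    "model.layers.0.self_attn.o_proj.weight": "merged-o",
    "model.layers.0.mlp.down_proj.weight": "merged-down",
}
donor = {
    "model.layers.0.self_attn.q_proj.weight": "donor-q",
    "model.layers.0.self_attn.o_proj.weight": "donor-o",
    "model.layers.0.mlp.down_proj.weight": "donor-down",
}

healed = heal_layers(merged, donor)
```

With real checkpoints the donor and merged models would also need matching layer counts and tensor shapes, which this sketch does not check.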