chargoddard committed
Commit 4f456d9
1 Parent(s): ced8eaf
Update README.md
README.md CHANGED
@@ -114,7 +114,8 @@ Full weight fine tuned on two epochs of [SlimOrca](https://huggingface.co/datase
 
 The base model for this came from a variation on Undi's [Mistral 11B recipe](https://huggingface.co/Undi95/Mistral-11B-v0.1). The `o_proj` and `down_proj` tensors were set to zero in the added layers, making the output exactly identical to Mistral 7B before training.
 
-Benchmarks look good locally but still evaluating actual usefulness
+~Benchmarks look good locally but still evaluating actual usefulness.~
+Update: this turned out great! 10/10 would recommend as a training approach.
 
 
 ### Reproducing
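
The README line in this diff says that zeroing `o_proj` and `down_proj` in the added layers leaves the output identical to Mistral 7B before training. A minimal sketch of why that can hold, assuming a residual block in which each branch ends in a bias-free output projection: with those projections at zero, each branch adds nothing and the block passes its input through unchanged. This is a pure-Python toy, not the author's code or the real Mistral layer; `toy_block`, `inner_attn`, `up_proj`, and the tiny dimensions are hypothetical names chosen for illustration, and normalization and real attention are left out.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def zeros(rows, cols):
    """All-zero matrix."""
    return [[0.0] * cols for _ in range(rows)]

def toy_block(x, inner_attn, o_proj, up_proj, down_proj):
    """Toy residual block (assumption: no biases, no normalization)."""
    # Stand-in for attention: a linear mix, then the output projection.
    attn_out = matvec(o_proj, matvec(inner_attn, x))
    x = [a + b for a, b in zip(x, attn_out)]
    # Stand-in for the MLP: up-projection, ReLU, then the down projection.
    hidden = [max(0.0, h) for h in matvec(up_proj, x)]
    mlp_out = matvec(down_proj, hidden)
    return [a + b for a, b in zip(x, mlp_out)]

rng = random.Random(0)
d, d_ff = 4, 8
x = [rng.uniform(-1, 1) for _ in range(d)]
inner_attn = rand_matrix(d, d, rng)
up_proj = rand_matrix(d_ff, d, rng)

# Zeroed output projections, as described for the added layers.
y_zero = toy_block(x, inner_attn, zeros(d, d), up_proj, zeros(d, d_ff))

# Randomly initialized output projections, for comparison.
y_rand = toy_block(x, inner_attn, rand_matrix(d, d, rng),
                   up_proj, rand_matrix(d, d_ff, rng))

print("zeroed projections keep the input:", y_zero == x)
print("random projections keep the input:", y_rand == x)
```

In this sketch the other weights in the block (`inner_attn`, `up_proj`) can hold any values without affecting the initial output, since everything they compute is multiplied by zero before reaching the residual stream.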