Update README.md
README.md
CHANGED
@@ -95,6 +95,14 @@ whereas the general (non-orthogonal) "multiplicative-LoRA" method can (in theory
`h' = h + u @ v^T @ h`

+It can also (in theory) perform [Householder Transformation(s)](https://en.wikipedia.org/wiki/Householder_transformation):
+
+`h' = h - 2 * v @ v^T @ h`
+
+by choosing to set `u = -2v` like so:
+
+`h' = h + (-2 * v) @ v^T @ h`
+
In general, the way to think about these (non-orthogonal) "multiplicative-LoRAs" is as a kind of "conditional control-vector":

- Each vector in `lora_A` looks for a certain direction, and via the dot-product it generates a (signed) weighting factor that measures the similarity between the output of the `down_proj` transformation and the specific vector in `lora_A`.
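As a quick check of the Householder claim in the added lines above, here is a minimal NumPy sketch (the names `u`, `v`, `h` follow the formulas; the dimension and the unit-normalisation of `v` are illustrative assumptions) showing that setting `u = -2v` in the general multiplicative-LoRA update reproduces the reflection `h' = h - 2 * v @ v^T @ h`:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random unit vector v (column) and a random hidden state h.
v = rng.standard_normal((8, 1))
v /= np.linalg.norm(v)
h = rng.standard_normal((8, 1))

# General multiplicative-LoRA update h' = h + u @ v^T @ h, with u = -2v.
u = -2.0 * v
h_lora = h + u @ v.T @ h

# Householder reflection about the hyperplane orthogonal to v.
h_householder = h - 2.0 * v @ v.T @ h

assert np.allclose(h_lora, h_householder)

# A Householder transformation is orthogonal, so the norm of h is preserved
# (this holds because v was normalised to unit length above).
assert np.isclose(np.linalg.norm(h_lora), np.linalg.norm(h))
```

The norm check only holds for unit-length `v`, which is what makes the Householder case an orthogonal special case of the general (non-orthogonal) multiplicative-LoRA update.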
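To make the "conditional control-vector" reading concrete, here is a small sketch (hypothetical shapes and sizes; `lora_A`, `lora_B` and `down_proj` are the names used above, everything else is illustrative) of how each row of `lora_A` turns the `down_proj` output into a signed weighting factor that scales the corresponding column of `lora_B`:

```python
import numpy as np

rng = np.random.default_rng(1)

hidden_dim, rank = 8, 2                    # hypothetical sizes, for illustration only
h = rng.standard_normal((hidden_dim, 1))   # output of the down_proj transformation
lora_A = rng.standard_normal((rank, hidden_dim))
lora_B = rng.standard_normal((hidden_dim, rank))

# Each row of lora_A dotted with h gives a (signed) weighting factor measuring
# how similar h is to that direction.
weights = lora_A @ h                       # shape (rank, 1)

# The update then adds the columns of lora_B, each scaled by its weighting factor:
# a "control vector" whose sign and strength are conditioned on h itself.
h_prime = h + lora_B @ weights

# Equivalent to the rank-r multiplicative form h' = (I + lora_B @ lora_A) @ h.
assert np.allclose(h_prime, (np.eye(hidden_dim) + lora_B @ lora_A) @ h)
```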