mikecovlee committed
Commit 7221067 • Parent(s): 674303e

Update README.md

README.md CHANGED
@@ -15,6 +15,13 @@ In addition, MixLoRA also allows simultaneous fine-tuning of the attention layer
 
 MixLoRA exists within m-LoRA as a specific adapter form, so m-LoRA can load, train, and fine-tune multiple distinct MixLoRA models simultaneously. Note, however, that all of these models must be based on the same pre-trained model.
 
+## MMLU Scores
+
+|Model            |Configuration                    |MMLU Average|STEM    |Social Sciences|Humanities|Other   |
+|-----------------|---------------------------------|------------|--------|---------------|----------|--------|
+|Alpaca-LoRA-7B   |LoRA Rank = 16, QKVO             | 24.2       | 24.1   |**25.0**       | 25.2     | 22.7   |
+|Alpaca-MixLoRA-7B|LoRA Rank = 8, Top-2 of 8 Experts|**25.5**    |**26.1**| 23.3          |**25.3**  |**26.9**|
+
 ## Configuration of MixLoRA
 
 Compared with LoRA, MixLoRA has some additional configurations.
@@ -132,14 +139,14 @@ Please cite the repo if you use the code in this repo.
 @misc{alpaca-mixlora-7b,
   author = {Dengchun, Li and Tingfeng, Lan and Zhengmao, Ye and Lei, Duan and Mingjie, Tang},
   title = {MixLoRA MoE model based on AlpacaCleaned dataset and LLaMA-7B base model},
-  year = {
+  year = {2024},
   publisher = {HuggingFace Hub},
   howpublished = {\url{https://huggingface.co/scu-kdde/alpaca-mixlora-7b}},
 }
 ```
 
 ## Copyright
-Copyright © 2023 All Rights Reserved.
+Copyright © 2023-2024 All Rights Reserved.
 
 This project is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
 
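For readers coming from plain LoRA, the "LoRA Rank = 8, Top-2 of 8 Experts" row in the new MMLU table corresponds to a handful of extra adapter options on top of the usual LoRA settings. The sketch below illustrates what such a configuration could look like; the key names and the alpha/dropout values are assumptions for illustration, not m-LoRA's actual schema.

```python
# Hypothetical MixLoRA adapter configuration, mirroring the
# "LoRA Rank = 8, Top-2 of 8 Experts" row of the MMLU table above.
# All keys are illustrative assumptions, not m-LoRA's real schema.
alpaca_mixlora_7b = {
    # plain-LoRA options
    "lora_r": 8,           # rank of each LoRA expert
    "lora_alpha": 16,      # LoRA scaling factor (assumed value)
    "lora_dropout": 0.05,  # dropout on the LoRA path (assumed value)
    # MixLoRA-specific options
    "num_experts": 8,      # LoRA experts per FFN layer
    "top_k": 2,            # experts activated per token
}
```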
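As a companion to the table and configuration above, here is a minimal, self-contained PyTorch sketch of top-2-of-8 routing over LoRA experts. It illustrates the general technique only; the class names, initialization, and routing details are assumptions, not m-LoRA's implementation.

```python
# Minimal sketch of top-k LoRA-expert routing (NOT m-LoRA's actual code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRAExpert(nn.Module):
    """One rank-r LoRA expert: x -> B(A(x)), initialized as a no-op."""
    def __init__(self, in_features: int, out_features: int, r: int = 8):
        super().__init__()
        self.lora_a = nn.Linear(in_features, r, bias=False)
        self.lora_b = nn.Linear(r, out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # standard LoRA init: B = 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.lora_b(self.lora_a(x))

class MixLoRALayer(nn.Module):
    """Frozen base projection plus a router mixing the top-k of n LoRA experts."""
    def __init__(self, base: nn.Linear, num_experts: int = 8, top_k: int = 2, r: int = 8):
        super().__init__()
        self.base = base.requires_grad_(False)  # pre-trained weight stays frozen
        self.router = nn.Linear(base.in_features, num_experts, bias=False)
        self.experts = nn.ModuleList(
            LoRAExpert(base.in_features, base.out_features, r) for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, in_features). Score experts, keep the top-k per token.
        scores = F.softmax(self.router(x), dim=-1)             # (tokens, num_experts)
        weights, idx = torch.topk(scores, self.top_k, dim=-1)  # (tokens, top_k)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over top-k
        out = self.base(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e  # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] = out[mask] + weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: wrap one FFN projection of a LLaMA-7B-sized model (dimensions assumed).
layer = MixLoRALayer(nn.Linear(4096, 11008), num_experts=8, top_k=2, r=8)
print(layer(torch.randn(5, 4096)).shape)  # torch.Size([5, 11008])
```

All trainable parameters here (the router and the experts) live outside the frozen base projection, which is what lets several such adapter sets share one pre-trained model, as the paragraph at the top of the diff describes.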