Commit 64a393f (parent: d51c1a9) by Joseph717171: Update README.md
README.md
CHANGED
```diff
@@ -4,10 +4,18 @@ library_name: transformers
 tags:
 - mergekit
 - merge
-
+license: apache-2.0
 ---
+
+# Credit for the model card's description goes to ddh0, mergekit, and MTSAIR
+
 # multi_verse_model-10.7B
 
+This is multi_verse_model-10.7B, a depth-upscaled version of [MTSAIR/multi_verse_model](https://huggingface.co/MTSAIR/multi_verse_model).
+
+This model is intended to be used as a basis for further fine-tuning, or as a drop-in upgrade from the original 7 billion parameter model.
+
+Paper detailing how Depth-Up Scaling works: [SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling](https://arxiv.org/abs/2312.15166)
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
 ## Merge Details
```
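The commit itself does not show the mergekit configuration, but a SOLAR-style depth upscale is typically expressed in mergekit as a `passthrough` merge that stacks two overlapping layer slices of the same base model, duplicating a span of middle layers. A minimal sketch of such a config follows; the layer ranges here mirror the SOLAR 10.7B recipe (24 + 24 layers from a 32-layer 7B model) and are assumptions for illustration, not the ranges this particular merge necessarily used:

```yaml
# Hypothetical depth-upscaling config (SOLAR-style passthrough merge).
# Layer ranges [0, 24] and [8, 32] are illustrative assumptions.
slices:
  - sources:
      - model: MTSAIR/multi_verse_model
        layer_range: [0, 24]
  - sources:
      - model: MTSAIR/multi_verse_model
        layer_range: [8, 32]
merge_method: passthrough
dtype: bfloat16
```

With a config like this saved as `config.yml`, the merge would be produced by running `mergekit-yaml config.yml ./output-model`; the overlap between the two slices is what grows the ~7B base into a ~10.7B model.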