Update README.md
Browse files
README.md
CHANGED
@@ -11,20 +11,13 @@ language:
|
|
11 |
---
|
12 |
# WestLake-10.7B-v2: Role-Play & Text Generation Specialist Model
|
13 |
|
14 |
-
|
15 |
-
* Context size: **8192** (even though Mistral-7B is 32k, WestLake was trained with 8k, and using a larger context is likely to cause problems)
|
16 |
-
* Prompt format: in general, Mistral based models are able to understand many prompt formats, but the following ones produce the best results, and are recommended
|
17 |
-
- **ChatML** (used during WestLake training)
|
18 |
-
- **Zephyr** (variant of ChatML which sometimes produces better results)
|
19 |
-
- **Alpaca** (reported by senseable as working better than ChatML)
|
20 |
-
- **Mistral Instruct** (original format from Mistral-7B)
|
21 |
|
22 |
-
This is my first viable self-merge of the fantastic WestLake-7B-v2 model, obtained after 12 rounds of testing
|
23 |
-
merge
|
24 |
-
and goliath-120b! I would describe the improvements as a better writing style, with more details. It
|
25 |
-
a small negative point, which is it has a bit more difficulties following instruction, but not by much.
|
26 |
|
27 |
-
It is also the first model I have tested to obtain a perfect score with the following test
|
28 |
```
|
29 |
Write a sequence of nominal groups that flow into one another, using the following rules:
|
30 |
- each nominal group is made of exactly 3 words
|
@@ -38,21 +31,23 @@ Present your solution as a list numbered with roman numerals.
|
|
38 |
Finally, explain why you chose your specific theme.
|
39 |
```
|
40 |
|
41 |
-
##
|
42 |
-
|
43 |
-
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
|
44 |
-
|
45 |
-
### Merge Method
|
46 |
|
47 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
48 |
|
49 |
-
|
50 |
|
|
|
|
|
51 |
The following models were included in the merge:
|
52 |
* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
|
53 |
|
54 |
-
### Configuration
|
55 |
-
|
56 |
The following YAML configuration was used to produce this model:
|
57 |
|
58 |
```yaml
|
@@ -78,13 +73,11 @@ slices:
|
|
78 |
|
79 |
---
|
80 |
|
81 |
-
# Original model card
|
82 |
|
83 |
**Update Notes:**
|
84 |
*Version 2 trained 1 additional epoch cycle for 3 total*
|
85 |
|
86 |
-
# Westlake-7Bv2: Role-Play & Text Generation Specialist Model
|
87 |
-
|
88 |
Welcome to the documentation of Westlake-7B, a cutting-edge language model designed for exceptional role-play and text generation tasks. This README file aims to provide an overview of our capabilities, usage guidelines, and potential applications.
|
89 |
|
90 |
## About Westlake-7Bv2
|
|
|
11 |
---
|
12 |
# WestLake-10.7B-v2: Role-Play & Text Generation Specialist Model
|
13 |
|
14 |
+
[GGUF version available here](https://huggingface.co/froggeric/WestLake-10.7B-v2-GGUF)
|
|
|
|
|
|
|
|
|
|
|
|
|
15 |
|
16 |
+
This is my first viable self-merge of the fantastic WestLake-7B-v2 model, obtained after more than 12 rounds of testing different
|
17 |
+
merge configurations. In my [LLM Creativity Benchmark](https://huggingface.co/datasets/froggeric/creativity), it greatly improves over the original 7B model, and ranks between miqu-1-120b
|
18 |
+
and goliath-120b! I would describe the improvements as a better writing style, with more details. It has a bit more difficulties following instructions, but not by much.
|
|
|
19 |
|
20 |
+
It is also the first model I have tested to obtain a perfect score with the following test:
|
21 |
```
|
22 |
Write a sequence of nominal groups that flow into one another, using the following rules:
|
23 |
- each nominal group is made of exactly 3 words
|
|
|
31 |
Finally, explain why you chose your specific theme.
|
32 |
```
|
33 |
|
34 |
+
## Usage
|
|
|
|
|
|
|
|
|
35 |
|
36 |
+
* Base model: senseable/WestLake-7B-v2 based of Mistral-7B-v0.1
|
37 |
+
* Context size: **8192** (even though Mistral-7B is 32k, WestLake was trained with 8k, and using a larger context is likely to cause problems)
|
38 |
+
* Prompt format: in general, Mistral based models are able to understand many prompt formats, but the following produce the best results, and are recommended
|
39 |
+
- **ChatML** (used during WestLake training)
|
40 |
+
- **Zephyr** (variant of ChatML which I have found to sometimes produce better results)
|
41 |
+
- **Alpaca** (reported by senseable as working better than ChatML)
|
42 |
+
- **Mistral Instruct** (original format from Mistral-7B)
|
43 |
|
44 |
+
## Merge Details
|
45 |
|
46 |
+
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).\
|
47 |
+
This model was merged using the passthrough merge method.\
|
48 |
The following models were included in the merge:
|
49 |
* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
|
50 |
|
|
|
|
|
51 |
The following YAML configuration was used to produce this model:
|
52 |
|
53 |
```yaml
|
|
|
73 |
|
74 |
---
|
75 |
|
76 |
+
# Original model card: Westlake-7Bv2: Role-Play & Text Generation Specialist Model
|
77 |
|
78 |
**Update Notes:**
|
79 |
*Version 2 trained 1 additional epoch cycle for 3 total*
|
80 |
|
|
|
|
|
81 |
Welcome to the documentation of Westlake-7B, a cutting-edge language model designed for exceptional role-play and text generation tasks. This README file aims to provide an overview of our capabilities, usage guidelines, and potential applications.
|
82 |
|
83 |
## About Westlake-7Bv2
|