Teja-Gollapudi committed
Commit b093c27 · 1 Parent(s): fea26e5
Update README.md
README.md CHANGED
@@ -1,8 +1,7 @@
 ---
 license: cc-by-3.0
 datasets:
-- VMware/open-instruct
-- conceptofmind/cot_submix_original
+- VMware/open-instruct
 language:
 - en
 library_name: transformers
@@ -15,9 +14,11 @@ Instruction-tuned version of SalesForce/Xgen-7b-8k-base. The model is open for <
 <b> NOTE </b> : The model was trained using the Alpaca prompt template <br>
 <b> NOTE </b> : tiktoken library is required for the tokenizer. Set trust_remote_code=True when launching the tokenizer.<br>
 
-We expanded Open-instruct with additional commercially viable zero-shot COT datasets from Flan v2
+We expanded Open-instruct with additional commercially viable zero-shot COT datasets from Flan v2 to total of 140k instruct-prompt responses. <br>
 
 
+<b>Open-instruct <br>
+
 Open-instruct-v1
 - Mosaic/Dolly-HHRLHF + filtered OASST1 - cc by 3.0
 
@@ -38,8 +39,9 @@ The model supports up to <b>8192 tokens </b>
 
 ## License
 - <b>Commercially Viable </b>
-- The instruction datasets used for instruction tuning are open for commercial usage.
+- The instruction datasets used for instruction tuning are open for commercial usage.
 - Language Model, ([Salesforce/xgen-7b-8k-base](https://huggingface.co/Salesforce/xgen-7b-8k-base)) is under apache-2.0
+- Dataset ([VMware/open-instruct](https://huggingface.co/datasets/VMware/open-instruct)) is under cc-by-sa-3.0
 
 
 
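
The two NOTE lines in the README (Alpaca prompt template, `trust_remote_code=True` for the tiktoken-based tokenizer) can be illustrated with a minimal sketch. It assumes the NOTE refers to the standard no-input Alpaca template, which the diff itself does not spell out. The helper names `build_prompt` and `load_tokenizer` are hypothetical, and the model repository id is left as a parameter because the diff does not name it.

```python
# Assumption: the standard Alpaca template (no-input variant) is the one
# the README's NOTE refers to; the diff does not quote the template text.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca-style prompt (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


def load_tokenizer(model_id: str):
    """Load the tokenizer with remote code enabled, as the README's NOTE requires.

    Needs the `transformers` and `tiktoken` packages plus network access.
    The import is local so that build_prompt works without them installed.
    """
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)


if __name__ == "__main__":
    print(build_prompt("Summarize the license terms of this model."))
```

Calling `load_tokenizer` with the repository id of the instruction-tuned model, then tokenizing the string returned by `build_prompt`, would give inputs in the format the model was reportedly trained on. Note that `trust_remote_code=True` executes code from the model repository, so it is worth reviewing that code first.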