danielpark committed `Update README.md` (commit 5027184, parent a898ed4).
<br>

KORANI is derived from GORANI, a llama2-based project that experiments with the distribution of appropriate datasets for transferring or distilling knowledge from English datasets. Officially, it is called Grid Of Ranvier Node In llama2 (GORANI), after the biological term node of Ranvier, and it aims to find the optimal dataset for transferring knowledge across various languages and specific domains. Because of strict licensing issues with the English datasets, GORANI is intended primarily for research purposes. Building on the experimental results of the GORANI project, we are therefore refining and training a commercially usable Korean dataset on top of llama2, and this project is named KORANI (Korean GORANI).

- I have conducted preliminary experiments with various techniques such as RoPE scaling, Attention Sinks, Flash Attention 1 and 2, SWA (Sliding Window Attention), and GQA (Grouped Query Attention).
- Please do not use the current model weights, as they are not useful.
- The most stringent non-commercial license (CC-BY-NC-4.0) among the licenses of the datasets used for training also applies to the model weights.
- On 2023-11-12, it was decided that all projects would be kept private. (They may be released in a non-public model format on cloud platforms by 2024.)
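The preliminary experiments above sweep several independent attention and positional-encoding variants. A minimal sketch of how such a sweep could be enumerated as a configuration grid is shown below; all names and values here are illustrative assumptions, not KORANI's actual experimental setup.

```python
from itertools import product

# Hypothetical experiment grid over the variants named above.
# Every key and value is an illustrative example, not the project's real config.
experiment_grid = {
    "rope_scaling": [None, {"type": "linear", "factor": 2.0}],  # RoPE context scaling
    "attention": ["eager", "flash_attention_2"],                # attention kernel choice
    "sliding_window": [None, 4096],                             # SWA: local window size in tokens
    "num_key_value_heads": [32, 8],                             # GQA: fewer KV heads than query heads
}

def enumerate_runs(grid):
    """Yield one flat config dict per combination of grid values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

runs = list(enumerate_runs(experiment_grid))
print(len(runs))  # 2 * 2 * 2 * 2 = 16 combinations
```

Enumerating the grid up front makes it easy to log each run's configuration alongside its results and to prune combinations that are known to be incompatible.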