datasets:
  - danielpark/gorani-100k-llama2-13b-instruct
language:
  - en
library_name: >-
  bitsandbytes, transformers, peft, accelerate, datasets,
  deepspeed, trl
pipeline_tag: text-generation

I am seeking a company that can provide support with at least two A100 GPUs.

I am looking for companies based outside South Korea that can provide resources for me to manage directly and exclusively, through SSH tunneling or cloud services, for a fixed period. If your company would like to discuss the terms of resource support, please contact me by email at parkminwoo1991@gmail.com. Your company will have partial access to my dataset and training code; I will retain full decision-making authority over the project, and any further agreements can be discussed later. Thank you for your consideration.

Please note that companies with Korean employees, offices in South Korea, a specific focus on AI-related business in South Korea, or hiring activity within South Korea are not eligible. These exclusions will be set out and confirmed in a legally binding contract.

The project is still in progress. Please refrain from using the weights and datasets.

  • We are currently experimenting with settings and techniques such as maximum sequence length, RoPE scaling, attention sinks, and FlashAttention 2. Please do not use the current model weights, as they are not yet useful. The dataset used for training is under a non-commercial (CC-BY-NC-4.0) license. Once training is complete, we will provide information about the datasets used along with the official release.
  • GORANI is intended for research purposes; the Korean language model, KORANI, can be used under a commercial-use license.

Status: The 19.7k checkpoint weights are open; we are waiting for results on the LLM leaderboard.

| Update Schedule | Task Description | Status |
|---|---|---|
| 23-10-05 | Completed training - 19.7k 13b weight (specific data) | Done |
| 23-10-06 | Submitted hf model weights (REV 01) | Done |
| 23-10-20 | Q.C | In progress |
| 23-10-17 | Completed training - 50k 13b weight | |
| 23-10-17 | Q.C | |
| 23-10-18 | Submitted hf model weights | |
| 23-10-28 | Completed training - 100k 13b weight | |
| 23-10-30 | Q.C | |
| 23-10-31 | Q.A | |
| 23-11-01 | Official weight release | |

GORANI 100k

  • Model: danielpark/gorani-100k-llama2-13b-instruct
  • Dataset: danielpark/gorani-100k
  • License: This model is licensed under Meta's LLaMA 2 license. You may not use it commercially, and you must adhere to the licenses of the included datasets; we therefore currently apply the strictest of those licenses. Please refrain from using it for commercial purposes under any circumstances until an official license is issued.

Template

I use llama2-13b with LFM, without a default system message. When a dataset specifies a system message, I use that content.

### System:
{System}

### User:
{New_User_Input}

### Input:
{New User Input}

### Assistant:
{New_Assistant_Answer}
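
The template above can be filled in programmatically. The following is a minimal sketch, not the author's own formatting code: the function name `build_prompt` and its parameters are hypothetical, and it assumes that the System and Input sections are simply omitted when no content is given and that sections are separated by one blank line, as in the template.

```python
from typing import Optional


def build_prompt(
    user: str,
    system: Optional[str] = None,
    input_text: Optional[str] = None,
) -> str:
    """Assemble a prompt in the '### System / User / Input / Assistant' layout."""
    sections = []
    # Assumption: there is no default system message, so skip the section when absent.
    if system:
        sections.append(f"### System:\n{system}")
    sections.append(f"### User:\n{user}")
    # Assumption: the Input section is optional and omitted when empty.
    if input_text:
        sections.append(f"### Input:\n{input_text}")
    # Leave the assistant header open so the model writes the answer after it.
    sections.append("### Assistant:\n")
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(build_prompt("Summarize the text.", input_text="GORANI is a research model."))
```

Whether the model was trained with empty sections dropped or kept is not stated here, so the omission logic should be checked against the training data before relying on it.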

Caution

The model weights and dataset have not been properly curated yet, and their use is strictly prohibited under any license. The developers assume no responsibility, express or implied, for their use.

Updates

| Revision | Commit Hash | Updated | Train Process | Status |
|---|---|---|---|---|
| Revision 01 | 6d30494fa8da84128499d55075eef57094336d03 | 23.10.04 | 19,740/100,000 | In training |