Paul Rock committed
Commit e61e1ad • Parent: 67d7168

Readme tuned, generation_config.json added

Files changed (2)
  1. README.md +24 -6
  2. generation_config.json +15 -0
README.md CHANGED
@@ -23,15 +23,31 @@ tags:
 - adapter
 ---
 
-# ruGPT-3.5 13B LoRA
+# ruGPT-3.5 13B LoRA: Adapter-Only Version
 
-This is an adapter-only version, based on [ruGPT-3.5-13B](https://huggingface.co/ai-forever/ruGPT-3.5-13B).
+Welcome to the adapter-only version of ruGPT-3.5 13B LoRA. This model is built upon the foundation of [ruGPT-3.5-13B](https://huggingface.co/ai-forever/ruGPT-3.5-13B).
 
-Training code is [here](https://github.com/EvilFreelancer/ruGPT-3.5-13B-lora)
+📌 Important: This model was trained using settings identical to [GigaSaiga](https://huggingface.co/IlyaGusev/gigasaiga_lora), but incorporates two additional datasets.
 
-> You may use ruGPT-3.5 13B fp16 base model instead.
+🔗 Training code is [here](https://github.com/EvilFreelancer/ruGPT-3.5-13B-lora).
 
-## Training procedure
+> Note: If you prefer, you can opt to use the ruGPT-3.5 13B fp16 base model.
+
+## 📚 Training Datasets
+
+The datasets utilized for training this model are consistent with those used for [Saiga-2](https://github.com/IlyaGusev/rulm).
+
+Here's the comprehensive list:
+
+- [ru_turbo_alpaca](https://huggingface.co/datasets/IlyaGusev/ru_turbo_alpaca)
+- [ru_turbo_alpaca_evol_instruct](https://huggingface.co/datasets/IlyaGusev/ru_turbo_alpaca_evol_instruct)
+- [ru_turbo_saiga](https://huggingface.co/datasets/IlyaGusev/ru_turbo_saiga)
+- [ru_sharegpt_cleaned](https://huggingface.co/datasets/IlyaGusev/ru_sharegpt_cleaned)
+- [oasst1_ru_main_branch](https://huggingface.co/datasets/IlyaGusev/oasst1_ru_main_branch)
+- [gpt_roleplay_realm](https://huggingface.co/datasets/IlyaGusev/gpt_roleplay_realm)
+- [ru_instruct_gpt4](https://huggingface.co/datasets/lksy/ru_instruct_gpt4)
+
+## 🛠 Training Procedure
 
 The following `bitsandbytes` quantization config was used during training:
 
@@ -46,7 +62,9 @@ The following `bitsandbytes` quantization config was used during training:
 - bnb_4bit_use_double_quant: False
 - bnb_4bit_compute_dtype: float32
 
-### Framework versions
+## ⚙️ Framework Versions
+
+Ensure you have the following framework versions for compatibility:
 
 - PyTorch 2.1.0
 - PEFT 0.5.0
generation_config.json ADDED
@@ -0,0 +1,15 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 2,
+  "eos_token_id": 3,
+  "pad_token_id": 0,
+  "transformers_version": "4.34.0",
+  "temperature": 0.2,
+  "top_p": 0.9,
+  "top_k": 30,
+  "do_sample": true,
+  "max_new_tokens": 1536,
+  "num_beams": 1,
+  "repetition_penalty": 1.15,
+  "no_repeat_ngram_size": 15
+}
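The new `generation_config.json` sets pure sampling defaults (`do_sample: true` with `num_beams: 1`) at a low temperature. A minimal stdlib-only sketch that parses the same JSON (copied from the diff above) and sanity-checks those defaults:

```python
# Parse the generation defaults added in this commit and inspect
# the sampling parameters they set.
import json

GENERATION_CONFIG = """{
  "_from_model_config": true,
  "bos_token_id": 2,
  "eos_token_id": 3,
  "pad_token_id": 0,
  "transformers_version": "4.34.0",
  "temperature": 0.2,
  "top_p": 0.9,
  "top_k": 30,
  "do_sample": true,
  "max_new_tokens": 1536,
  "num_beams": 1,
  "repetition_penalty": 1.15,
  "no_repeat_ngram_size": 15
}"""

config = json.loads(GENERATION_CONFIG)

# do_sample with num_beams == 1 means plain (non-beam) sampling;
# temperature 0.2 keeps generations fairly conservative.
assert config["do_sample"] is True
assert config["num_beams"] == 1
print(config["temperature"], config["top_p"], config["top_k"])
# → 0.2 0.9 30
```

Because the file sits in the model repo, `transformers` picks these defaults up automatically when `model.generate()` is called without explicit overrides.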