andreaskoepf committed on
Commit
1182ca8
1 Parent(s): 722df9a

Update README.md

  1. README.md +46 -0
README.md CHANGED
@@ -1,3 +1,49 @@
  ---
  license: apache-2.0
  ---
+
+ Preliminary info during eval. Model card will be updated.
+
+ wandb: https://wandb.ai/open-assistant/supervised-finetuning/runs/3lr77x4h
+ export: 560 steps
+
+
+ Model:
+ ```
+ falcon-40b:
+   dtype: bf16
+   log_dir: "falcon_log_40b"
+   learning_rate: 5e-6
+   model_name: "tiiuae/falcon-40b"
+   deepspeed_config: configs/zero3_config_falcon.json
+   output_dir: falcon
+   weight_decay: 0.0
+   max_length: 2048
+   warmup_steps: 20
+   gradient_checkpointing: true
+   gradient_accumulation_steps: 1
+   per_device_train_batch_size: 18
+   per_device_eval_batch_size: 10
+   eval_steps: 80
+   save_steps: 80
+   num_train_epochs: 8
+   save_total_limit: 4
+   use_flash_attention: false
+   residual_dropout: 0.3
+   residual_dropout_lima: true
+   sort_by_length: false
+   save_strategy: steps
+ ```
+
+ Dataset:
+ ```
+ oasst_only:
+   save_strategy: epoch
+   datasets:
+     - oasst_export:
+         lang: "bg,ca,cs,da,de,en,es,fr,hr,hu,it,nl,pl,pt,ro,ru,sl,sr,sv,uk"
+         input_file_path: 2023-04-04_oasst_ready.jsonl.gz
+         val_split: 0.05
+   sort_by_length: false
+   use_custom_sampler: false
+ ```
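
The numbers in the added README lines can be cross-checked with a little arithmetic. The sketch below is not part of the commit or of the training code; it only copies values from the configs above and shows how the 560-step export relates to `save_steps` and `save_total_limit`. The helper names and the `world_size` argument are hypothetical (the README does not state how many devices were used), and the retention rule assumes the oldest checkpoints are deleted first.

```python
# Values copied from the falcon-40b config shown in the diff above.
config = {
    "per_device_train_batch_size": 18,
    "gradient_accumulation_steps": 1,
    "eval_steps": 80,
    "save_steps": 80,
    "save_total_limit": 4,
}
EXPORT_STEP = 560  # from the README line "export: 560 steps"


def checkpoint_index(step: int, save_steps: int) -> int:
    """Return the 1-based index of the checkpoint written at `step`."""
    if step % save_steps != 0:
        raise ValueError(f"step {step} is not a multiple of save_steps={save_steps}")
    return step // save_steps


def retained_checkpoints(step: int, save_steps: int, limit: int) -> list[int]:
    """Steps of the checkpoints still on disk at `step`.

    Assumption: at most `limit` checkpoints are kept and the oldest go first.
    """
    saved = list(range(save_steps, step + 1, save_steps))
    return saved[-limit:]


def sequences_per_step(cfg: dict, world_size: int) -> int:
    """Sequences consumed per optimizer step.

    `world_size` (number of devices) is a hypothetical input; the README
    does not report it.
    """
    return (
        cfg["per_device_train_batch_size"]
        * cfg["gradient_accumulation_steps"]
        * world_size
    )


if __name__ == "__main__":
    print(checkpoint_index(EXPORT_STEP, config["save_steps"]))
    print(retained_checkpoints(EXPORT_STEP, config["save_steps"], config["save_total_limit"]))
    print(sequences_per_step(config, world_size=8))  # 8 devices is only an example
```

Under these assumptions the exported step 560 is the seventh checkpoint, and it falls on an evaluation step because `eval_steps` and `save_steps` are both 80.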