andreaskoepf committed
Commit 6dee96a
Parent(s): 8f4bc57

Update README.md

Files changed (1): README.md (+54 -0)

---
license: apache-2.0
---

wandb: https://wandb.ai/open-assistant/supervised-finetuning/runs/t2adm3wu

checkpoint: 11000 steps (2 epochs)

datasets:
```
pretrain:
  num_train_epochs: 1
  weight_decay: 0.01
  use_custom_sampler: true
  sort_by_length: false
  datasets:
    - joke
    - webgpt:
        val_split: 0.1
    - gpt4all:
        val_split: 0.01
    - alpaca:
        val_split: 0.025
    - code_alpaca:
        val_split: 0.05
    - minimath
    - humaneval_mbpp_codegen_qa
    - humaneval_mbpp_testgen_qa
    - grade_school_math_instructions
    - recipes
    - cmu_wiki_qa
    - oa_wiki_qa_bart_10000row
    - prosocial_dialogue:
        fraction: 0.1
    - explain_prosocial:
        fraction: 0.05
```
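
The per-dataset options above are overrides: `val_split` sets the share held out for validation, while `fraction` subsamples a large dataset before it enters the training mixture. A minimal sketch of how such a mixed list of bare names and name-to-options entries can be parsed (assuming PyYAML; `parse_dataset_entries` is a hypothetical helper for illustration, not the actual Open-Assistant loader):

```
import yaml

def parse_dataset_entries(config_text):
    """Normalize the datasets list into (name, options) pairs."""
    config = yaml.safe_load(config_text)["pretrain"]
    entries = []
    for item in config["datasets"]:
        if isinstance(item, str):
            # Bare entry like "- joke": no per-dataset options.
            entries.append((item, {}))
        else:
            # Mapping entry like "- webgpt: {val_split: 0.1}".
            name, opts = next(iter(item.items()))
            entries.append((name, opts or {}))
    return entries

config_text = """
pretrain:
  datasets:
    - joke
    - webgpt:
        val_split: 0.1
    - prosocial_dialogue:
        fraction: 0.1
"""

for name, opts in parse_dataset_entries(config_text):
    print(name, opts)  # joke {}, webgpt {'val_split': 0.1}, ...
```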

pythia:
```
pythia-1.4b-pretrain:
  dtype: fp16
  learning_rate: 6e-6
  model_name: EleutherAI/pythia-1.4b-deduped
  deepspeed_config: configs/zero_config_pretrain.json
  weight_decay: 0.0
  max_length: 2048
  use_flash_attention: true
  warmup_steps: 50
  gradient_checkpointing: false
  gradient_accumulation_steps: 1
  per_device_train_batch_size: 16
  per_device_eval_batch_size: 16
  num_train_epochs: 2
  save_total_limit: 2
```
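
To try the resulting checkpoint, a minimal inference sketch with `transformers` (the model id is a placeholder for this repo, and the `<|prompter|>`/`<|assistant|>` prompt format is an assumption carried over from other Open-Assistant SFT models; verify both against this repo's files):

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "andreaskoepf/oasst-sft-pythia-1.4b"  # placeholder: use this repo's id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# Assumed Open-Assistant SFT prompt format; check tokenizer_config.json.
prompt = "<|prompter|>What is a language model?<|endoftext|><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```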