chaoscodes committed on
Commit
73915cd
1 Parent(s): 0b659ce

Update README.md

Files changed (1)
  1. README.md +3 -21
README.md CHANGED
@@ -56,25 +56,13 @@ Here we list our data distribution in each stage:
 
 | Corpus | Basic pretraining | Continual pretraining with specific domain | Cooldown |
 | ------------- | ----------------- | ------------------------------------------ | -------- |
-| RedPajamaBook | 5.4 | 5.4 | 5.4 |
-| C4 | 35.0 | 35.0 | 35.0 |
-| CommonCrawl | 70.1 | 70.1 | 70.1 |
-| Github | 6.5 | 6.5 | 6.5 |
-| StackExchange | 4.2 | 4.2 | 4.2 |
-| ArXiv | 5.7 | 5.7 | 5.7 |
-| Wikipedia | 4.5 | 4.5 | 4.5 |
+| Slimpajama | 100.0 | 100.0 | 100.0 |
 
 ### TinyLlama_v1.1_math_code
 
 | Corpus | Basic pretraining | Continual pretraining with specific domain | Cooldown |
 | ------------- | ----------------- | ------------------------------------------ | -------- |
-| RedPajamaBook | 5.4 | - | - |
-| C4 | 35.0 | 21.6 | 21.6 |
-| CommonCrawl | 70.1 | 43.0 | 43.0 |
-| Github | 6.5 | - | - |
-| StackExchange | 4.2 | 2.6 | 2.6 |
-| ArXiv | 5.7 | 5.0 | 5.0 |
-| Wikipedia | 4.5 | 2.8 | 2.8 |
+| Slimpajama | 100.0 | 75.0 | 75.0 |
 | starcoder | - | 15.0 | 15.0 |
 | proof_pile | - | 10.0 | 10.0 |
 
@@ -82,13 +70,7 @@ Here we list our data distribution in each stage:
 
 | orpus | Basic pretraining | Continual pretraining with specific domain | Cooldown |
 | ------------- | ----------------- | ------------------------------------------ | -------- |
-| RedPajamaBook | 5.4 | - | - |
-| C4 | 35.0 | 14.6 | 14.6 |
-| CommonCrawl | 70.1 | 29.3 | 29.3 |
-| Github | 6.5 | - | - |
-| StackExchange | 4.2 | 1.8 | 1.8 |
-| ArXiv | 5.7 | 2.4 | 2.4 |
-| Wikipedia | 4.5 | 1.9 | 1.9 |
+| Slimpajama | 100.0 | 50.0 | 50.0 |
 | skypile | - | 50.0 | 50.0 |
 
 ### How to use