TeeZee committed on
Commit
6c859be
1 Parent(s): 67ac562

Create README.md

Files changed (1)
  1. README.md +20 -0
README.md ADDED
---
license: apache-2.0
datasets:
- athirdpath/Merge_Glue
---

### TeeZee/NEBULA-XB-v1.0_SFT_2_epoch ###

Experiment: can DUS (Depth Up-Scaling) be taken one or more steps further?

### Technical notes:
- pretrained model NEBULA-XB-v1.0, finetuned on 30k entries from the Merge_Glue dataset
- 18 layers removed from both copies of the finetuned GALAXY-XB-v03
- the model has 108 layers: (((48-12)*2)-18)*2 = 108
- second step in the scaling DUS procedure
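
The layer arithmetic above can be sketched as two DUS steps, each dropping some layers and then stacking two copies of the trimmed model. This is a minimal illustration, not an implementation: the helper name `dus_step` is made up here, and only the counts (48 base layers, 12 and 18 layers removed, 108 final) come from the notes.

```python
def dus_step(num_layers: int, layers_removed: int) -> int:
    """One Depth Up-Scaling step: drop `layers_removed` layers,
    then stack two copies of the trimmed model."""
    return (num_layers - layers_removed) * 2

base = 48                    # layers in the original base model
step1 = dus_step(base, 12)   # (48 - 12) * 2 = 72  -> GALAXY-XB scale
step2 = dus_step(step1, 18)  # (72 - 18) * 2 = 108 -> this model
print(step1, step2)          # 72 108
```

Expanding both steps gives exactly the formula from the notes: (((48-12)*2)-18)*2 = 108.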
17
+
18
+
19
+ ### To evaluate
20
+ - model performance after merge, should be a little lover that GALAXY finetuned on 50k of slimorca