Envoid committed
Commit 3d191c7
1 Parent(s): 85aaaa9

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -44,7 +44,7 @@ Using [qlora-pipe](https://github.com/tdrussell/qlora-pipe) I ran a qlora on Nem
 
 The atypically high dropout rate was chosen after some unreleased experimentation inspired by the Arxiv paper: [Fine-tuning with Very Large Dropout (Jianyu Zhang, Léon Bottou)](https://arxiv.org/abs/2403.00946)
 
-Which prescribes the use of a very high dropout rate (0.9 in their case) as a method of improving out-of-domain performance. Further discussion on various internet spaces regarding high dropout training lead to a recommendation of 0.6 as the ideal dropout rate for optimal fitting during finetuning.
+Which prescribes the use of a very high dropout rate (0.9 in their case) as a method of improving out-of-distribution performance. Further discussion on various internet spaces regarding high dropout training lead to a recommendation of 0.6 as the ideal dropout rate for optimal fitting during finetuning.
 
 # Merging
 
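
For illustration only (not part of this commit or its training run): a minimal sketch of how the 0.6 dropout discussed in the diff could be expressed as LoRA adapter dropout using a peft-style `LoraConfig`. The rank, alpha, and target modules below are placeholder assumptions, and qlora-pipe itself is driven by its own config files rather than this object.

```python
from peft import LoraConfig

# Hypothetical adapter config; only lora_dropout=0.6 reflects the value
# discussed in the README diff, the other values are illustrative guesses.
lora_config = LoraConfig(
    r=64,                      # assumed LoRA rank, not stated in the diff
    lora_alpha=64,             # assumed scaling alpha, not stated in the diff
    lora_dropout=0.6,          # the atypically high dropout rate described above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    task_type="CAUSAL_LM",
)
```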