mfajcik committed
Commit e6568ef (1 parent: 00cbbb7)

Update README.md

Files changed (1): README.md (+1 −1)
README.md CHANGED
@@ -37,7 +37,7 @@ The model was trained on 3 corpora, which were hot-swapped during the training.
  <img src="figures/tloss_full.png" width="900"/>
  Figure 1: Training loss.
  <img src="figures/tloss_closeup.png" width="900"/>
- Figure 2: Training loss closeup. We mark two hotswap places, where the training corpus #1 was switched for internal-corpus #2 and internal-corpus #2.1 respectively.
+ Figure 2: Training loss close-up. We mark the two hot-swap points where training corpus #1 was switched for internal-corpus #2 and internal-corpus #2.1, respectively. The flat region between 112k and 119.5k steps is caused by missing data; these logs were lost in an accident.
 
  Additionally, we perform two ablations:
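The corpus hot-swapping mentioned in the caption — replacing the data source feeding the training loop at preset step boundaries — can be sketched as below. This is a minimal illustration only: the step thresholds and corpus names in `SCHEDULE` are hypothetical, not the model's actual training configuration.

```python
def corpus_for_step(step, schedule):
    """Return the name of the corpus active at a given training step.

    `schedule` is a list of (start_step, corpus_name) pairs sorted by
    start_step; the entry with the largest start_step <= step is active.
    """
    active = schedule[0][1]
    for start, name in schedule:
        if step >= start:
            active = name
    return active


# Hypothetical schedule: train on corpus #1, then hot-swap twice.
# The step boundaries here are illustrative, not the real ones.
SCHEDULE = [
    (0, "corpus-1"),
    (50_000, "internal-corpus-2"),
    (100_000, "internal-corpus-2.1"),
]

if __name__ == "__main__":
    for step in (0, 49_999, 50_000, 120_000):
        print(step, corpus_for_step(step, SCHEDULE))
```

A training loop would call `corpus_for_step` at each step (or at checkpoint boundaries) and rebuild its data loader whenever the returned corpus changes.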