readme: fix markdown
README.md CHANGED
@@ -19,8 +19,8 @@ Initially, we integrated xLSTM model training into Flair - for more information
 # Changelog
 
 - 06.09.2024: We discovered a (potential) bug in the pretraining code: when using the complete Wikipedia corpus, unfortunately only the first 512 subtokens of each article are used.
-
-
+  We implement a grouping-based approach that tokenizes the whole corpus and groups it into 512-subtoken chunks.
+  Pretraining with this new approach is currently running.
 - 29.08.2024: Uploaded a model re-trained for 1 epoch over the complete German Wikipedia corpus. Training was done with gradient clipping (0.25).
 - 28.08.2024: Model training is now done with a [Helibrunna](https://github.com/AI-Guru/helibrunna) fork - find it [here](https://github.com/HallerPatrick/helibrunna).
 - 10.06.2024: Initial version. xLSTM was trained with the Flair library; see this [old](https://huggingface.co/stefan-it/xlstm-german-wikipedia/blob/flair-old/README.md) branch.
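For context on the 06.09.2024 entry above: the grouping-based preprocessing it describes is commonly implemented by tokenizing whole articles without truncation, concatenating the token streams, and slicing them into fixed-size blocks. The sketch below illustrates that pattern under stated assumptions; the tokenizer checkpoint and corpus name are placeholders, and this is not the repository's actual training code.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

BLOCK_SIZE = 512  # chunk length in subtokens, per the changelog entry

# Placeholders: the tokenizer and corpus used for this model may differ.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
dataset = load_dataset("wikipedia", "20220301.de", split="train")

def tokenize(batch):
    # Tokenize full articles without truncation, so no text beyond the
    # first 512 subtokens of each article is silently dropped.
    return tokenizer(batch["text"])

def group_texts(batch):
    # Concatenate all tokenized articles, then split the combined stream
    # into fixed 512-subtoken chunks so the whole corpus is used.
    concatenated = {k: sum(batch[k], []) for k in batch.keys()}
    total = (len(concatenated["input_ids"]) // BLOCK_SIZE) * BLOCK_SIZE
    return {
        k: [seq[i : i + BLOCK_SIZE] for i in range(0, total, BLOCK_SIZE)]
        for k, seq in concatenated.items()
    }

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
chunks = tokenized.map(group_texts, batched=True)
```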
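The gradient clipping mentioned in the 29.08.2024 entry corresponds to capping the global gradient norm at 0.25 before each optimizer step. A minimal, self-contained PyTorch illustration follows; the model and data are stand-ins, not the Helibrunna setup, and only the clipping value comes from the changelog.

```python
import torch
import torch.nn as nn

# Stand-in model and batch for demonstration purposes only.
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
x, y = torch.randn(8, 16), torch.randn(8, 1)

loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Rescale gradients so their global L2 norm does not exceed 0.25.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.25)
optimizer.step()
optimizer.zero_grad()
```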