phonemetransformers's Collections

From Babble to Words

This collection contains the models, tokenizers, and datasets used for our BabyLM 2024 submission. There are eight prediction files (predictions.json.gz); the best of these is BPE-TXT.
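The internal layout of the prediction files is not described here, so the following is only a minimal sketch for taking a first look at one of them. It assumes that predictions.json.gz is an ordinary gzip-compressed JSON document; the helper names `load_predictions` and `summarize` are hypothetical and not part of the submission.

```python
import gzip
import json


def load_predictions(path):
    """Read a gzip-compressed JSON file and return the decoded object.

    Assumption: the file is plain JSON compressed with gzip.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def summarize(obj):
    """Describe the top-level structure without assuming a schema."""
    if isinstance(obj, dict):
        return {"type": "dict", "keys": sorted(obj)}
    if isinstance(obj, list):
        return {"type": "list", "length": len(obj)}
    return {"type": type(obj).__name__}
```

Calling `summarize(load_predictions("predictions.json.gz"))` on a downloaded file would show whether the top level is a mapping or a list before any further processing is attempted.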