	Improve the baseline model's performance on the BabyLM benchmark.

	Summary: This shared task challenges community members to train a language model from scratch on the same amount of linguistic data available to a child. Submissions should be implemented in Hugging Face's Transformers library and will be evaluated on a shared pipeline. This shared task is co-sponsored by CMCL and CoNLL.

	To run the baseline model, execute train.py. It trains a standard GPT-2 model on the BabyLM data and saves the final model to the output/ folder.
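
A minimal sketch of launching the baseline from Python and listing what it wrote. The helper names `baseline_command` and `run_baseline` are hypothetical, and the sketch assumes train.py takes no command-line arguments and is run from the task directory; neither is confirmed by the task description.

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import Path


def baseline_command(script: str = "train.py") -> list[str]:
    """Build the command that runs the script with the current interpreter."""
    return [sys.executable, script]


def run_baseline(script: str = "train.py", output_dir: str = "output") -> list[Path]:
    """Run the baseline script, then return the files found in output_dir."""
    # Assumption: train.py needs no arguments; adjust the command if it does.
    subprocess.run(baseline_command(script), check=True)

    out = Path(output_dir)
    if not out.is_dir():
        raise FileNotFoundError(f"{output_dir}/ was not created by {script}")
    # Only top-level files are listed; nested checkpoint folders are ignored.
    return sorted(p for p in out.iterdir() if p.is_file())
```

Calling `run_baseline()` raises `subprocess.CalledProcessError` if training exits with a non-zero status, so a failed run is not mistaken for a finished one.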

	When you submit your final answer, the checkpoint saved in the output/ folder will be evaluated on a held-out test set.
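
Because only the checkpoint in the output folder is scored, it may be worth checking that the folder holds a loadable model before submitting. The sketch below assumes the checkpoint was written with the Transformers `save_pretrained` method in its unsharded form (a `config.json` plus `model.safetensors` or `pytorch_model.bin`); the helper names are hypothetical, and the actual evaluation pipeline may load the checkpoint differently.

```python
from pathlib import Path

# File names that save_pretrained writes for an unsharded checkpoint.
WEIGHT_FILES = ("model.safetensors", "pytorch_model.bin")


def missing_checkpoint_files(output_dir="output"):
    """Return a list describing expected checkpoint files that are absent."""
    out = Path(output_dir)
    missing = []
    if not (out / "config.json").is_file():
        missing.append("config.json")
    if not any((out / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing


def load_checkpoint(output_dir="output"):
    """Load the saved model as a causal LM to confirm it deserializes."""
    # Imported lazily so the file check above works without transformers.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(output_dir)
```

An empty list from `missing_checkpoint_files()` means the expected files are present; `load_checkpoint()` then confirms that the weights actually load.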