---
license: apache-2.0
---

This is the model checkpoint release for [Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models](https://arxiv.org/abs/2408.06663). All fine-tuned model checkpoints are released in this repository, with each checkpoint stored under its own revision. The revision naming convention is `olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}`; the available revision names can be enumerated programmatically (see the sketch after the citation).

To load a specific model checkpoint, pass its revision name to `from_pretrained`:

```
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",
    trust_remote_code=True,
    revision="your revision",
)
```

All checkpoints are fine-tuned from the checkpoints of [OLMo1b-HF](https://huggingface.co/allenai/OLMo-1B-hf).

Citation:

```
@misc{sun2024amurocharanalyzing,
      title={Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models},
      author={Kaiser Sun and Mark Dredze},
      year={2024},
      eprint={2408.06663},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.06663},
}
```
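The loading snippet above needs a concrete revision name. As a minimal sketch (not part of the original release notes), assuming each fine-tuned checkpoint is published as a branch of this repository, the available revision names can be listed with `huggingface_hub`:

```
from huggingface_hub import list_repo_refs

# List all refs of the model repository; each fine-tuned checkpoint is assumed
# to live on its own branch named olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}.
refs = list_repo_refs("KaiserWhoLearns/PTvsSFT_OLMo1b")

for branch in refs.branches:
    print(branch.name)
```

Any of the printed names can then be passed as the `revision` argument in the loading example above.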