---
license: apache-2.0
---

This is the model checkpoint release for Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models.
All the fine-tuned model checkpoints are released in this repository. The naming convention of the revisions is `olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}`.
To load a specific model checkpoint, use the following code:
```
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",
    trust_remote_code=True,
    revision="your revision",
)
```
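As a rough sketch, a revision string can be assembled from the naming convention above and passed to `from_pretrained`. All field values below are placeholders, not verified revision names, and the tokenizer is assumed to be unchanged from the base model.
```
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assemble a revision name following olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}.
# Every value here is a placeholder; substitute the actual checkpoint, dataset,
# epoch, and learning-rate identifiers listed in this repository's revisions.
revision = "olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}".format(
    checkpoint="step10000",   # placeholder
    train_dataset="dataset",  # placeholder
    epoch="1ep",              # placeholder
    lr="2e-5",                # placeholder
)

model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",
    revision=revision,
    trust_remote_code=True,
)

# Assumes the tokenizer is inherited unchanged from the base model,
# so it is loaded from the original OLMo-1B-hf repository.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B-hf")
```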
All checkpoints are fine-tuned from checkpoints of [OLMo1b-HF](https://huggingface.co/allenai/OLMo-1B-hf).
Citation:
```
@misc{sun2024amurocharanalyzing,
title={Amuro \& Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models},
author={Kaiser Sun and Mark Dredze},
year={2024},
eprint={2408.06663},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2408.06663},
}
```