---
license: apache-2.0
---

This is the model checkpoint release for *Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models*.

All fine-tuned model checkpoints are released as revisions of this repository. The naming convention of the revisions is `olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}`.
To load a specific checkpoint, pass its revision name to `from_pretrained`:
```
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",
    trust_remote_code=True,
    revision="your revision",  # a revision name following the pattern above
)
```
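
To see which fine-tuned checkpoints are available, the repository's revisions can also be listed programmatically. A minimal sketch using `huggingface_hub` (assuming it is installed alongside `transformers`):
```
from huggingface_hub import list_repo_refs

# Each fine-tuned checkpoint is exposed as a revision of the repository;
# revisions are typically branches, and tags (if any) are in refs.tags.
refs = list_repo_refs("KaiserWhoLearns/PTvsSFT_OLMo1b")
for branch in refs.branches:
    print(branch.name)
```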

All models in this repository are fine-tuned from checkpoints of [OLMo-1B-hf](https://huggingface.co/allenai/OLMo-1B-hf).
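
For an end-to-end generation example, the tokenizer can presumably be loaded from the base model, since fine-tuning leaves it unchanged. A minimal sketch under that assumption (the prompt and generation settings below are placeholders, not part of the original release):
```
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the fine-tuned checkpoints reuse the base OLMo-1B-hf tokenizer.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B-hf")
model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",
    trust_remote_code=True,
    revision="your revision",
)

inputs = tokenizer("Your prompt here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```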

Citation:
```
@misc{sun2024amurocharanalyzing,
      title={Amuro \& Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models},
      author={Kaiser Sun and Mark Dredze},
      year={2024},
      eprint={2408.06663},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.06663}, 
}
```
