This is the model checkpoint release for [Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models](https://arxiv.org/abs/2408.06663).

All the fine-tuned model checkpoints are released in this repository. The naming convention of the revisions is `olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}`.
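The available revisions can also be listed programmatically with `huggingface_hub`; this is a minimal sketch, and the printed branch names follow the convention above:

```
from huggingface_hub import list_repo_refs

# Each fine-tuned checkpoint is stored as a branch (revision) of this repository.
refs = list_repo_refs("KaiserWhoLearns/PTvsSFT_OLMo1b")
for branch in refs.branches:
    print(branch.name)
```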
To load a specific model checkpoint, use the following snippet.
```
from transformers import AutoModelForCausalLM

# Load a fine-tuned checkpoint by passing its revision name.
model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",
    trust_remote_code=True,
    revision="your revision",
)
```
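A quick sanity check of a loaded checkpoint might look like the following sketch; it assumes the fine-tuned revisions reuse the base OLMo-1B tokenizer rather than shipping a modified one:

```
from transformers import AutoTokenizer

# Tokenizer of the base model (assumed unchanged by fine-tuning).
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B-hf")

inputs = tokenizer("Language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```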
All the checkpoints are fine-tuned from checkpoints of [OLMo-1B-HF](https://huggingface.co/allenai/OLMo-1B-hf).
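For comparison with the fine-tuned revisions, the base model can be loaded in the same way (a minimal sketch; if the base repository exposes intermediate pre-training steps as revisions, the same `revision` argument selects them):

```
from transformers import AutoModelForCausalLM

# Base pre-trained OLMo-1B model that the fine-tuned checkpoints start from.
base_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1B-hf")
```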
Citation:
```
@misc{sun2024amurocharanalyzing,
      title={Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models},
      author={Kaiser Sun and Mark Dredze},
      year={2024},
      eprint={2408.06663},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.06663},
}
```
---
license: apache-2.0
---
|