This is the model checkpoint release for Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models.

All fine-tuned model checkpoints are released in this repository. Revisions are named olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}. To load a specific checkpoint, pass its revision name to from_pretrained as follows.

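from transformers import AutoModelForCausalLM

# Assumption: False is enough for HF-format checkpoints; set to True if loading asks for it.
trust_remote_code = False

model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",
    trust_remote_code=trust_remote_code,
    revision="your revision",  # a name following the convention above
)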

All checkpoints are fine-tuned from checkpoints of OLMo1b-HF.
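
The revision naming convention above can be turned into a small helper. The following is a minimal sketch, not part of the release: `build_revision` and `list_revisions` are hypothetical helper names, the placeholder values are not real revision names, and the sketch assumes the revisions are published as branches or tags that `huggingface_hub.list_repo_refs` can enumerate.

```python
def build_revision(checkpoint: str, train_dataset: str, epoch: str, lr: str) -> str:
    """Fill the olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr} template."""
    return f"olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}"


def list_revisions(repo_id: str = "KaiserWhoLearns/PTvsSFT_OLMo1b") -> list[str]:
    """Return branch and tag names of the repository (needs network access).

    Assumption: the fine-tuned checkpoints are exposed as branches or tags.
    """
    # Imported lazily so build_revision works without huggingface_hub installed.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    return [ref.name for ref in list(refs.branches) + list(refs.tags)]


# Placeholder values only; they do not correspond to actual revisions.
example = build_revision("CKPT", "DATASET", "EPOCH", "LR")
print(example)
```

Because the exact format of each field is not spelled out here, calling `list_revisions()` and copying a name from its result is likely safer than assembling one by hand.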

Citation:

@misc{sun2024amurocharanalyzing,
      title={Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models}, 
      author={Kaiser Sun and Mark Dredze},
      year={2024},
      eprint={2408.06663},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.06663}, 
}

license: apache-2.0
