README.md · anthracite-org/magnum-v1-72b at 3adeae780ade5ad693423ecc17abdce8d3865c62

metadata

license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
language:
  - en
  - zh
pipeline_tag: text-generation
tags:
  - chat

This is the first in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. This model is fine-tuned on top of Qwen-2 72B Instruct.

Prompting

The model has been instruct-tuned with ChatML formatting. A typical input looks like this:

"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
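As a minimal sketch of how such a prompt could be assembled in code, the helper below builds the ChatML string from a list of messages. The function name `format_chatml` and the `add_generation_prompt` parameter are illustrative assumptions, not part of this repository; in practice the tokenizer's own chat template may already do this for you.

```python
from typing import Dict, List

# ChatML delimiters, as shown in the example prompt above.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages: List[Dict[str, str]],
                  add_generation_prompt: bool = True) -> str:
    """Render a list of {"role", "content"} dicts as a ChatML prompt.

    Hypothetical helper for illustration only.
    """
    parts = []
    for message in messages:
        # Each turn is: <|im_start|>ROLE\nCONTENT<|im_end|>\n
        parts.append(
            f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n"
        )
    prompt = "".join(parts)
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        prompt += f"{IM_START}assistant\n"
    return prompt


messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

prompt = format_chatml(messages)
print(prompt)
```

The printed string matches the example above: three completed turns followed by an open assistant turn for the model to continue.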

Credits

This model has been a team effort. Credits go to:

Sao10K for help with (and cleaning up!) the dataset.
alpindale for the training.
kalomaze for helping with the hyperparameter tuning.
Various other people for their continued help as we tuned the parameters and restarted failed runs. In no particular order: Doctor Shotgun, Lucy, Nopm, Mango, and the rest of the Silly Tilly.

And last but not least, we'd like to thank Kearm for sponsoring the compute needed to train this model.

Training

The model was trained on 55 million tokens of high-quality roleplay (RP) data over 1.5 epochs. We used 8x AMD Instinct™ MI300X Accelerators for the full-parameter fine-tuning.
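For a rough sense of scale, the figures above can be turned into a token budget. This is back-of-the-envelope arithmetic only: the variable names are illustrative, and the per-accelerator figure assumes the data was split evenly across the 8 devices, which the card does not state.

```python
# Figures quoted in the Training section.
dataset_tokens = 55_000_000   # tokens in the training set
epochs = 1.5                  # passes over the data
accelerators = 8              # MI300X devices used

# Total tokens the model saw during fine-tuning.
total_tokens_seen = dataset_tokens * epochs

# ASSUMPTION: even split of tokens across devices (not stated in the card).
tokens_per_accelerator = total_tokens_seen / accelerators

print(f"Total tokens seen:      {total_tokens_seen:,.0f}")
print(f"Tokens per accelerator: {tokens_per_accelerator:,.0f}")
```

Under these assumptions the run covers about 82.5 million tokens in total.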

Safety

...