The Romulus model series has been released on Hugging Face, continually pre-trained on 34,864,949 tokens of French laws and intended to serve as a foundation for fine-tuning on labeled data 🤗
The training code, dataset, and model weights are open and freely available on HF, and training was carried out on H100 GPUs provided by Microsoft for Startups, using Unsloth AI by @danielhanchen and @shimmyshimmer 🦥
Link to the base model: louisbrulenaudet/Romulus-cpt-Llama-3.1-8B-v0.1
Link to the instruct model: louisbrulenaudet/Romulus-cpt-Llama-3.1-8B-v0.1-Instruct
Link to the dataset: louisbrulenaudet/Romulus-cpt-fr
Please note that these models have not been aligned to produce usable text out of the box and will need further fine-tuning on the desired tasks to give satisfactory results.
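As a quick illustration (not part of the release itself), here is a minimal sketch of how the base checkpoint could be loaded with the standard transformers API as a starting point for your own fine-tuning; the repository id comes from the links above, while the dtype, device placement, and prompt are assumptions. Loading through Unsloth's FastLanguageModel is another option, since training used Unsloth.

```python
# Minimal sketch: load the Romulus base checkpoint for inspection or further fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "louisbrulenaudet/Romulus-cpt-Llama-3.1-8B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 to keep the 8B weights memory-friendly
    device_map="auto",           # spread weights across available GPUs
)

# Quick sanity check: generate a short completion on a French legal prompt (illustrative only).
prompt = "Selon le Code civil,"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```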