---
language: fr
license: mit
datasets:
- oscar
---

# Based on the Original Model:
## Almanach/camembert-base:
### "A Tasty French Language Model"

## Link: https://huggingface.co/almanach/camembert-base

## Pre-trained models by almanach/camembert-base
### https://huggingface.co/almanach/camembert-base/blob/main/README.md#pre-trained-models

| Model                          | #params                        | Arch. | Training data                     |
|--------------------------------|--------------------------------|-------|-----------------------------------|
| `camembert-base`               | 110M                           | Base  | OSCAR (138 GB of text)            |
| `camembert/camembert-large`    | 335M                           | Large | CCNet (135 GB of text)            |
| `camembert/camembert-base-ccnet` | 110M                         | Base  | CCNet (135 GB of text)            |
| `camembert/camembert-base-wikipedia-4gb` | 110M                 | Base  | Wikipedia (4 GB of text)          |
| `camembert/camembert-base-oscar-4gb` | 110M                     | Base  | Subsample of OSCAR (4 GB of text) |
| `camembert/camembert-base-ccnet-4gb` | 110M                     | Base  | Subsample of CCNet (4 GB of text) |

## Fine-Tuning by MisterAI/ALMANACH_CamemBERT_Agent001

Testing training/fine-tuning for now >:)

| Model                          | #params                        | Arch. | Training data                     |
|--------------------------------|--------------------------------|-------|-----------------------------------|
| `MisterAI/ALMANACH_CamemBERT_Agent001` based on `camembert-base` | 110M | Base | MisterAI/SimpleSmallFrenchQA (50 KB of text) |

************

If you use our work, please cite:

@inproceedings{martin2020camembert,
  title={CamemBERT: a Tasty French Language Model},
  author={Martin, Louis and Muller, Benjamin and Su{\'a}rez, Pedro Javier Ortiz and Dupont, Yoann and Romary, Laurent and de la Clergerie, {\'E}ric Villemonte and Seddah, Djam{\'e} and Sagot, Beno{\^\i}t},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  year={2020}
}

*************
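
## Usage sketch

The card lists checkpoints but does not show how to load one, so here is a minimal sketch using the `transformers` `fill-mask` pipeline. The names `CHECKPOINTS`, `resolve_checkpoint`, `fill_mask`, and `demo` are hypothetical helpers introduced for illustration. It is also an assumption that `MisterAI/ALMANACH_CamemBERT_Agent001` still exposes a masked-language-model head like `camembert-base`; the card does not say which task head the fine-tuned checkpoint carries.

```python
# Hub identifiers copied from this model card.
CHECKPOINTS = {
    "base": "almanach/camembert-base",
    # Assumption: the fine-tuned checkpoint keeps a masked-LM head.
    "agent001": "MisterAI/ALMANACH_CamemBERT_Agent001",
}


def resolve_checkpoint(name: str) -> str:
    """Map a short alias to its Hugging Face Hub identifier."""
    try:
        return CHECKPOINTS[name]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise KeyError(f"unknown checkpoint {name!r}; expected one of: {known}")


def fill_mask(text: str, checkpoint: str = "base", top_k: int = 5):
    """Return the top_k predictions for the <mask> token in `text`.

    Downloads the model on first use, so it needs network access
    and the `transformers` package with a PyTorch backend installed.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=resolve_checkpoint(checkpoint))
    return unmasker(text, top_k=top_k)


def demo() -> None:
    """Print candidate fillers for a French sentence with one masked word."""
    for prediction in fill_mask("Le camembert est <mask> :)"):
        print(prediction["token_str"], round(prediction["score"], 3))
```

Calling `demo()` downloads `almanach/camembert-base` and prints the predicted tokens with their scores. Passing `checkpoint="agent001"` to `fill_mask` tries the fine-tuned model instead, under the assumption stated above.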