metadata
language: fr
license: mit
datasets:
  - oscar

Based on the original model:

almanach/camembert-base:

"A Tasty French Language Model"

Link: https://huggingface.co/almanach/camembert-base

Pre-trained models by almanach/camembert-base

https://huggingface.co/almanach/camembert-base/blob/main/README.md#pre-trained-models

| Model | #params | Arch. | Training data |
|---|---|---|---|
| camembert-base | 110M | Base | OSCAR (138 GB of text) |
| camembert/camembert-large | 335M | Large | CCNet (135 GB of text) |
| camembert/camembert-base-ccnet | 110M | Base | CCNet (135 GB of text) |
| camembert/camembert-base-wikipedia-4gb | 110M | Base | Wikipedia (4 GB of text) |
| camembert/camembert-base-oscar-4gb | 110M | Base | Subsample of OSCAR (4 GB of text) |
| camembert/camembert-base-ccnet-4gb | 110M | Base | Subsample of CCNet (4 GB of text) |
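
As a rough illustration of how one of the base checkpoints listed above can be queried, here is a minimal sketch using the `transformers` fill-mask pipeline. The helper names (`make_masked_sentence`, `top_predictions`) and the example sentence are hypothetical and chosen for illustration only. It assumes `transformers`, a backend such as PyTorch, and `sentencepiece` are installed, and that the model can be downloaded from the Hub.

```python
# Minimal sketch: masked-word prediction with a CamemBERT checkpoint.
# Helper names are hypothetical; only the `pipeline` call is the library's API.

DEFAULT_MODEL_ID = "almanach/camembert-base"
MASK_TOKEN = "<mask>"  # mask token used by the CamemBERT tokenizer


def make_masked_sentence(sentence: str, word: str) -> str:
    """Replace the first occurrence of `word` in `sentence` with the mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, MASK_TOKEN, 1)


def top_predictions(masked_sentence: str, model_id: str = DEFAULT_MODEL_ID, top_k: int = 5):
    """Return (token, score) pairs for the masked position.

    Downloads the model on first use, so this needs network access.
    """
    # Imported lazily so make_masked_sentence works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id, tokenizer=model_id)
    results = fill_mask(masked_sentence, top_k=top_k)
    return [(r["token_str"], r["score"]) for r in results]
```

Calling `top_predictions(make_masked_sentence("Le camembert est délicieux :)", "délicieux"))` would return the five highest-scoring candidate tokens for the masked word. Passing `model_id="MisterAI/ALMANACH_CamemBERT_Agent001"` is untested here and assumes that checkpoint still carries a masked-language-modeling head.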

Fine-tuning by MisterAI/ALMANACH_CamemBERT_Agent001

Testing training/fine-tuning for now >:)

| Model | #params | Arch. | Training data |
|---|---|---|---|
| MisterAI/ALMANACH_CamemBERT_Agent001 (based on camembert-base) | 110M | Base | MisterAI/SimpleSmallFrenchQA (50 KB of text) |

If you use our work, please cite:

@inproceedings{martin2020camembert,
  title={CamemBERT: a Tasty French Language Model},
  author={Martin, Louis and Muller, Benjamin and Su{\'a}rez, Pedro Javier Ortiz and Dupont, Yoann and Romary, Laurent and de la Clergerie, {\'E}ric Villemonte and Seddah, Djam{\'e} and Sagot, Beno{\^\i}t},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  year={2020}
}