---
language: fr
license: mit
tags:
- deberta-v2
- question-answering
base_model: almanach/camembertav2-base
datasets:
- FQuAD
metrics:
- accuracy
pipeline_tag: question-answering
library_name: transformers
model-index:
- name: almanach/camembertav2-base-fquad
  results:
  - task:
      type: question-answering
      name: FQuAD
    dataset:
      type: FQuAD
      name: FQuAD
    metrics:
    - name: f1
      type: f1
      value: 83.36016
      verified: false
    - name: Exact Match
      type: em
      value: 64.42911
      verified: false
---

# Model Card for almanach/camembertav2-base-fquad

almanach/camembertav2-base-fquad is a deberta-v2 model for extractive question answering in French. It was fine-tuned on the FQuAD dataset and achieves an F1 score of 83.36016 (exact match: 64.42911) on the FQuAD dev set.

The model is part of the almanach/camembertav2-base family of fine-tuned models.

## Model Details

### Model Description

- **Developed by:** Wissam Antoun (PhD student at ALMAnaCH, Inria Paris)
- **Model type:** deberta-v2
- **Language(s) (NLP):** French
- **License:** MIT
- **Finetuned from model:** almanach/camembertav2-base

### Model Sources

- **Repository:** https://github.com/WissamAntoun/camemberta
- **Paper:** https://arxiv.org/abs/2411.08868

## Uses

The model can be used for extractive question answering in French: given a question and a context passage, it predicts the answer span within the passage.

## Bias, Risks, and Limitations

The model may reproduce biases present in its training data, may not generalize well to other datasets or tasks, and is limited to the domains covered by the data it was trained on.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

model = AutoModelForQuestionAnswering.from_pretrained("almanach/camembertav2-base-fquad")
tokenizer = AutoTokenizer.from_pretrained("almanach/camembertav2-base-fquad")

qa_pipeline = pipeline("question-answering", model=model, tokenizer=tokenizer)

# Returns a dict with keys 'score', 'start', 'end', and 'answer'
qa_pipeline(
    question="Quelle est la capitale de la France ?",
    context="La capitale de la France est Paris.",
)
```

## Training Details

### Training Data

The model is trained on the FQuAD dataset.

- Dataset Name: FQuAD
- Dataset Size:
  - Train: 20731
  - Dev: 3188

### Training Procedure

The model was fine-tuned with the `run_qa.py` example script from the Hugging Face Transformers repository.
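Purely as an illustration, the sketch below shows what a comparable `run_qa.py` invocation could look like with the hyperparameters reported in the next section. The FQuAD file paths and output directory are placeholders, since FQuAD is distributed separately as SQuAD-style JSON; this is not the exact command used for this checkpoint.

```python
# Illustrative sketch only: invoking the Transformers run_qa.py example script
# with the hyperparameters listed below. File paths are placeholders.
import subprocess

subprocess.run(
    [
        "python", "run_qa.py",
        "--model_name_or_path", "almanach/camembertav2-base",
        "--train_file", "fquad/train.json",       # placeholder: local FQuAD train split
        "--validation_file", "fquad/valid.json",  # placeholder: local FQuAD dev split
        "--do_train",
        "--do_eval",
        "--max_seq_length", "896",
        "--doc_stride", "128",
        "--max_answer_length", "30",
        "--per_device_train_batch_size", "8",
        "--gradient_accumulation_steps", "2",
        "--learning_rate", "3e-5",
        "--num_train_epochs", "6",
        "--lr_scheduler_type", "cosine",
        "--seed", "1",
        "--output_dir", "./camembertav2-base-fquad",  # placeholder output directory
    ],
    check=True,
)
```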
#### Training Hyperparameters

```yml
'Unnamed: 0': /scratch/camembertv2/runs/results/fquad/camembertav2-base-bf16-p2-17000/max_seq_length-896-doc_stride-128-max_answer_length-30-gradient_accumulation_steps-2-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-cosine-warmup_steps-0/SEED-1/all_results.json
accelerator_config: '{''split_batches'': False, ''dispatch_batches'': None, ''even_batches'': True, ''use_seedable_sampler'': True, ''non_blocking'': False, ''gradient_accumulation_kwargs'': None}'
adafactor: false
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1.0e-08
auto_find_batch_size: false
base_model: camembertv2
base_model_name: camembertav2-base-bf16-p2-17000
batch_eval_metrics: false
bf16: false
bf16_full_eval: false
data_seed: 1.0
dataloader_drop_last: false
dataloader_num_workers: 0
dataloader_persistent_workers: false
dataloader_pin_memory: true
dataloader_prefetch_factor: .nan
ddp_backend: .nan
ddp_broadcast_buffers: .nan
ddp_bucket_cap_mb: .nan
ddp_find_unused_parameters: .nan
ddp_timeout: 1800
debug: '[]'
deepspeed: .nan
disable_tqdm: false
dispatch_batches: .nan
do_eval: true
do_predict: false
do_train: true
epoch: 6.0
eval_accumulation_steps: 1
eval_delay: 0
eval_do_concat_batches: true
eval_exact_match: 64.42910915934755
eval_f1: 83.36016013340664
eval_on_start: false
eval_runtime: 45.7589
eval_samples: 3188.0
eval_samples_per_second: 69.669
eval_steps: .nan
eval_steps_per_second: 1.093
eval_strategy: epoch
eval_use_gather_object: false
evaluation_strategy: epoch
fp16: false
fp16_backend: auto
fp16_full_eval: false
fp16_opt_level: O1
fsdp: '[]'
fsdp_config: '{''min_num_params'': 0, ''xla'': False, ''xla_fsdp_v2'': False, ''xla_fsdp_grad_ckpt'': False}'
fsdp_min_num_params: 0
fsdp_transformer_layer_cls_to_wrap: .nan
full_determinism: false
gradient_accumulation_steps: 2
gradient_checkpointing: false
gradient_checkpointing_kwargs: .nan
greater_is_better: true
group_by_length: false
half_precision_backend: auto
hub_always_push: false
hub_model_id: .nan
hub_private_repo: false
hub_strategy: every_save
hub_token:
ignore_data_skip: false
include_inputs_for_metrics: false
include_num_input_tokens_seen: false
include_tokens_per_second: false
jit_mode_eval: false
label_names: .nan
label_smoothing_factor: 0.0
learning_rate: 3.0e-05
length_column_name: length
load_best_model_at_end: true
local_rank: 0
log_level: debug
log_level_replica: warning
log_on_each_node: true
logging_dir: /scratch/camembertv2/runs/results/fquad/camembertav2-base-bf16-p2-17000/max_seq_length-896-doc_stride-128-max_answer_length-30-gradient_accumulation_steps-2-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-cosine-warmup_steps-0/SEED-1/logs
logging_first_step: false
logging_nan_inf_filter: true
logging_steps: 100
logging_strategy: steps
lr_scheduler_kwargs: '{}'
lr_scheduler_type: cosine
max_grad_norm: 1.0
max_steps: -1
metric_for_best_model: exact_match
mp_parameters: .nan
name: camembertv2/runs/results/fquad/camembertav2-base-bf16-p2-17000/max_seq_length-896-doc_stride-128-max_answer_length-30-gradient_accumulation_steps-2-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-cosine-warmup_steps-0
neftune_noise_alpha: .nan
no_cuda: false
num_train_epochs: 6.0
optim: adamw_torch
optim_args: .nan
optim_target_modules: .nan
output_dir: /scratch/camembertv2/runs/results/fquad/camembertav2-base-bf16-p2-17000/max_seq_length-896-doc_stride-128-max_answer_length-30-gradient_accumulation_steps-2-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-cosine-warmup_steps-0/SEED-1
overwrite_output_dir: false
past_index: -1
per_device_eval_batch_size: 64
per_device_train_batch_size: 8
per_gpu_eval_batch_size: .nan
per_gpu_train_batch_size: .nan
prediction_loss_only: false
push_to_hub: false
push_to_hub_model_id: .nan
push_to_hub_organization: .nan
push_to_hub_token:
ray_scope: last
remove_unused_columns: true
report_to: '[''tensorboard'']'
restore_callback_states_from_checkpoint: false
resume_from_checkpoint: .nan
run_name: camembertav2-base-bf16-p2-17000
save_on_each_node: false
save_only_model: false
save_safetensors: true
save_steps: 500
save_strategy: epoch
save_total_limit: .nan
seed: 1
skip_memory_metrics: true
split_batches: .nan
tf32: .nan
torch_compile: true
torch_compile_backend: inductor
torch_compile_mode: .nan
torch_empty_cache_steps: .nan
torchdynamo: .nan
total_flos: 2.0394634246921464e+16
tpu_metrics_debug: false
tpu_num_cores: .nan
train_loss: 0.5145930189164087
train_runtime: 3736.1381
train_samples: 20731
train_samples_per_second: 33.293
train_steps_per_second: 2.081
use_cpu: false
use_ipex: false
use_legacy_prediction_loop: false
use_mps_device: false
warmup_ratio: 0.0
warmup_steps: 0
weight_decay: 0.0
```
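The question-answering pipeline accepts the same windowing parameters at inference time. The sketch below mirrors the training-time settings from the listing above; `max_seq_len`, `doc_stride`, `max_answer_len`, and `top_k` are standard pipeline call arguments, and the values are taken from the hyperparameter dump.

```python
from transformers import pipeline

# Minimal sketch: run inference with the same windowing settings used for
# fine-tuning (values from the hyperparameter listing above).
qa_pipeline = pipeline("question-answering", model="almanach/camembertav2-base-fquad")

result = qa_pipeline(
    question="Quelle est la capitale de la France ?",
    context="La capitale de la France est Paris.",
    max_seq_len=896,     # matches max_seq_length used at training time
    doc_stride=128,      # overlap between windows for long contexts
    max_answer_len=30,   # matches max_answer_length used at training time
    top_k=3,             # return the 3 best candidate spans
)
print(result)
```

Matching `max_seq_len` and `doc_stride` mainly matters for contexts longer than the model window; for short contexts like this one, the defaults behave identically.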
#### Results

Results on the FQuAD dev set (3188 samples):

**F1-Score:** 83.36016

**Exact Match:** 64.42911

## Technical Specifications

### Model Architecture and Objective

A deberta-v2 encoder with a span-prediction head, fine-tuned for extractive question answering in French.

## Citation

**BibTeX:**

```bibtex
@misc{antoun2024camembert20smarterfrench,
      title={CamemBERT 2.0: A Smarter French Language Model Aged to Perfection},
      author={Wissam Antoun and Francis Kulumba and Rian Touchent and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
      year={2024},
      eprint={2411.08868},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.08868},
}
```