---
language: fr
license: mit
tags:
- roberta
- text-classification
- review-classification
base_model: almanach/camembertv2-base
datasets:
- FLUE-CLS
metrics:
- accuracy
pipeline_tag: text-classification
library_name: transformers
widget:
# example for the french classification model
- text: "Le livre est très intéressant et j'ai appris beaucoup de choses."
  example_title: Books Review
- text: "Le film était ennuyeux et je n'ai pas aimé les acteurs."
  example_title: DVD Review
- text: "La musique était très bonne et j'ai adoré les paroles."
  example_title: Music Review
model-index:
- name: almanach/camembertv2-base-cls
  results:
  - task:
      type: text-classification
      name: Amazon Review Classification
    dataset:
      type: flue-cls
      name: FLUE-CLS
    metrics:
    - name: accuracy
      type: accuracy
      value: 0.95199
      verified: false
---

# Model Card for almanach/camembertv2-base-cls

almanach/camembertv2-base-cls is a RoBERTa-architecture model for French text classification. It is fine-tuned on the FLUE-CLS dataset for Amazon review classification and reaches an accuracy of 0.95199 on the 5,999 FLUE-CLS test examples.

The model belongs to the family of models fine-tuned from almanach/camembertv2-base.

## Model Details

### Model Description

- **Developed by:** Wissam Antoun (PhD student at Almanach, Inria-Paris)
- **Model type:** roberta
- **Language(s) (NLP):** French
- **License:** MIT
- **Fine-tuned from model:** almanach/camembertv2-base

### Model Sources

- **Repository:** https://github.com/WissamAntoun/camemberta
- **Paper:** https://arxiv.org/abs/2411.08868

## Uses

The model can be used to classify French-language Amazon reviews of movies, music, and books.

## Bias, Risks, and Limitations

The model may reproduce biases present in its training data. Because it was fine-tuned only on French Amazon reviews of books, DVDs, and music, it may not generalize well to other domains, text types, or tasks.


## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub.
model = AutoModelForSequenceClassification.from_pretrained("almanach/camembertv2-base-cls")
tokenizer = AutoTokenizer.from_pretrained("almanach/camembertv2-base-cls")

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# The pipeline returns a list with one {"label": ..., "score": ...} dict per input text.
print(classifier("Le livre est très intéressant et j'ai appris beaucoup de choses."))
```
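
To score several reviews at once and see the scores behind each decision, something like the following may help. It is a minimal sketch, not part of the original card: `best_label` and `classify_reviews` are hypothetical helper names, and the label strings are taken from whatever the model's configuration provides rather than assumed here.

```python
from typing import Any, Dict, List


def best_label(scores: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the highest-scoring {"label", "score"} entry for one review."""
    return max(scores, key=lambda entry: entry["score"])


def classify_reviews(texts: List[str]) -> List[Dict[str, Any]]:
    """Score a batch of French reviews and keep the top label for each."""
    # Imported lazily so best_label can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "text-classification",
        model="almanach/camembertv2-base-cls",
        top_k=None,  # return a score for every label, not only the best one
    )
    # For a list of texts, the pipeline yields one list of label scores per text.
    return [best_label(scores) for scores in classifier(texts)]
```

Calling `classify_reviews` with a list of review strings downloads the model on first use and returns one `{"label": ..., "score": ...}` dict per review.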


## Training Details

### Training Data

The model is trained on the FLUE-CLS dataset.

- Dataset Name: FLUE-CLS
- Dataset Size:
    - Train: 5997
    - Test: 5999


### Training Procedure

The model was trained with the `run_classification.py` script from the Hugging Face Transformers repository.



#### Training Hyperparameters

```yml
accelerator_config: '{''split_batches'': False, ''dispatch_batches'': None, ''even_batches'':
  True, ''use_seedable_sampler'': True, ''non_blocking'': False, ''gradient_accumulation_kwargs'':
  None}'
adafactor: false
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1.0e-08
auto_find_batch_size: false
base_model: camembertv2
base_model_name: camembertv2-base-bf16-p2-17000
batch_eval_metrics: false
bf16: false
bf16_full_eval: false
data_seed: 1.0
dataloader_drop_last: false
dataloader_num_workers: 0
dataloader_persistent_workers: false
dataloader_pin_memory: true
dataloader_prefetch_factor: .nan
ddp_backend: .nan
ddp_broadcast_buffers: .nan
ddp_bucket_cap_mb: .nan
ddp_find_unused_parameters: .nan
ddp_timeout: 1800
debug: '[]'
deepspeed: .nan
disable_tqdm: false
dispatch_batches: .nan
do_eval: true
do_predict: false
do_train: true
epoch: 5.984
eval_accumulation_steps: 4
eval_accuracy: 0.9519919986664444
eval_delay: 0
eval_do_concat_batches: true
eval_loss: 0.2167392075061798
eval_on_start: false
eval_runtime: 52.247
eval_samples: 5999
eval_samples_per_second: 114.82
eval_steps: .nan
eval_steps_per_second: 14.355
eval_strategy: epoch
eval_use_gather_object: false
evaluation_strategy: epoch
fp16: false
fp16_backend: auto
fp16_full_eval: false
fp16_opt_level: O1
fsdp: '[]'
fsdp_config: '{''min_num_params'': 0, ''xla'': False, ''xla_fsdp_v2'': False, ''xla_fsdp_grad_ckpt'':
  False}'
fsdp_min_num_params: 0
fsdp_transformer_layer_cls_to_wrap: .nan
full_determinism: false
gradient_accumulation_steps: 4
gradient_checkpointing: false
gradient_checkpointing_kwargs: .nan
greater_is_better: true
group_by_length: false
half_precision_backend: auto
hub_always_push: false
hub_model_id: .nan
hub_private_repo: false
hub_strategy: every_save
hub_token: <HUB_TOKEN>
ignore_data_skip: false
include_inputs_for_metrics: false
include_num_input_tokens_seen: false
include_tokens_per_second: false
jit_mode_eval: false
label_names: .nan
label_smoothing_factor: 0.0
learning_rate: 3.0e-05
length_column_name: length
load_best_model_at_end: true
local_rank: 0
log_level: debug
log_level_replica: warning
log_on_each_node: true
logging_dir: /scratch/camembertv2/runs/results/flue-CLS/camembertv2-base-bf16-p2-17000/max_seq_length-1024-gradient_accumulation_steps-4-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-cosine-warmup_steps-0/SEED-1/logs
logging_first_step: false
logging_nan_inf_filter: true
logging_steps: 100
logging_strategy: steps
lr_scheduler_kwargs: '{}'
lr_scheduler_type: cosine
max_grad_norm: 1.0
max_steps: -1
metric_for_best_model: accuracy
mp_parameters: .nan
name: camembertv2/runs/results/flue-CLS/camembertv2-base-bf16-p2-17000/max_seq_length-1024-gradient_accumulation_steps-4-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-cosine-warmup_steps-0
neftune_noise_alpha: .nan
no_cuda: false
num_train_epochs: 6.0
optim: adamw_torch
optim_args: .nan
optim_target_modules: .nan
output_dir: /scratch/camembertv2/runs/results/flue-CLS/camembertv2-base-bf16-p2-17000/max_seq_length-1024-gradient_accumulation_steps-4-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-cosine-warmup_steps-0/SEED-1
overwrite_output_dir: false
past_index: -1
per_device_eval_batch_size: 8
per_device_train_batch_size: 8
per_gpu_eval_batch_size: .nan
per_gpu_train_batch_size: .nan
prediction_loss_only: false
push_to_hub: false
push_to_hub_model_id: .nan
push_to_hub_organization: .nan
push_to_hub_token: <PUSH_TO_HUB_TOKEN>
ray_scope: last
remove_unused_columns: true
report_to: '[''tensorboard'']'
restore_callback_states_from_checkpoint: false
resume_from_checkpoint: .nan
run_name: /scratch/camembertv2/runs/results/flue-CLS/camembertv2-base-bf16-p2-17000/max_seq_length-1024-gradient_accumulation_steps-4-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-cosine-warmup_steps-0/SEED-1
save_on_each_node: false
save_only_model: false
save_safetensors: true
save_steps: 500
save_strategy: epoch
save_total_limit: .nan
seed: 1
skip_memory_metrics: true
split_batches: .nan
tf32: .nan
torch_compile: true
torch_compile_backend: inductor
torch_compile_mode: .nan
torch_empty_cache_steps: .nan
torchdynamo: .nan
total_flos: 6620464611065820.0
tpu_metrics_debug: false
tpu_num_cores: .nan
train_loss: 0.1198142634143591
train_runtime: 1269.2954
train_samples: 5997
train_samples_per_second: 28.348
train_steps_per_second: 0.884
use_cpu: false
use_ipex: false
use_legacy_prediction_loop: false
use_mps_device: false
warmup_ratio: 0.0
warmup_steps: 0
weight_decay: 0.0

```
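
A few of the values above determine the length and shape of the run. The sketch below works through that arithmetic and a half-cosine decay with zero warmup, matching `lr_scheduler_type: cosine` and `warmup_steps: 0`. It assumes a single training device, which the configuration does not state explicitly, and `cosine_lr` is an illustrative function rather than the trainer's own code.

```python
import math

# Values copied from the configuration above.
TRAIN_SAMPLES = 5997
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 4
NUM_EPOCHS = 6
PEAK_LR = 3.0e-05

# Assumption: one training device (not stated in the configuration).
NUM_DEVICES = 1

# Batches per epoch; the last, partial batch is kept (dataloader_drop_last: false).
batches_per_epoch = math.ceil(TRAIN_SAMPLES / (PER_DEVICE_BATCH * NUM_DEVICES))
# One optimizer update per group of GRAD_ACCUM_STEPS batches.
updates_per_epoch = batches_per_epoch // GRAD_ACCUM_STEPS
total_updates = updates_per_epoch * NUM_EPOCHS
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES


def cosine_lr(step: int, total_steps: int = total_updates, peak: float = PEAK_LR) -> float:
    """Half-cosine decay from `peak` at step 0 to 0 at `total_steps`, no warmup."""
    progress = step / total_steps
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch size: {effective_batch}")
print(f"optimizer updates: {total_updates}")
for step in (0, total_updates // 2, total_updates):
    print(f"step {step:4d}: lr = {cosine_lr(step):.2e}")
```

Under that single-device assumption the run has 1,122 optimizer updates at an effective batch size of 32, which agrees with the reported `train_steps_per_second` multiplied by `train_runtime` (0.884 × 1269.2954 ≈ 1122).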

#### Results

**Accuracy:** 0.95199
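
As a quick sanity check, the reported accuracy can be converted back into counts. The sketch below assumes accuracy is the fraction of correctly classified examples among the 5,999 evaluation samples; both constants are copied from the configuration dump above.

```python
EVAL_SAMPLES = 5999                 # eval_samples
EVAL_ACCURACY = 0.9519919986664444  # eval_accuracy

# Accuracy = correct / total, so the number of correct predictions follows directly.
correct = round(EVAL_ACCURACY * EVAL_SAMPLES)
errors = EVAL_SAMPLES - correct

print(f"correct: {correct}, errors: {errors}")
print(f"recomputed accuracy: {correct / EVAL_SAMPLES:.5f}")
```

This gives 5,711 correct predictions and 288 errors, which rounds back to the reported 0.95199.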

## Technical Specifications

### Model Architecture and Objective

RoBERTa architecture with a sequence classification head.

## Citation

**BibTeX:**

```bibtex
@misc{antoun2024camembert20smarterfrench,
      title={CamemBERT 2.0: A Smarter French Language Model Aged to Perfection},
      author={Wissam Antoun and Francis Kulumba and Rian Touchent and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
      year={2024},
      eprint={2411.08868},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.08868},
}
```