metadata
license: apache-2.0
language: en
tags:
- generated_from_trainer
datasets:
- squad_v2
model-index:
- name: albert-base-v2-squad_v2
results:
- task:
name: Question Answering
type: question-answering
dataset:
type: squad_v2
name: The Stanford Question Answering Dataset
args: en
metrics:
- type: eval_exact
value: 78.8175
- type: eval_f1
value: 81.9984
- type: eval_HasAns_exact
value: 75.3374
- type: eval_HasAns_f1
value: 81.7083
- type: eval_NoAns_exact
value: 82.2876
- type: eval_NoAns_f1
value: 82.2876
albert-base-v2-squad_v2
This model is a fine-tuned version of albert-base-v2 on the squad_v2 dataset.
Model description
This model is fine-tuned for extractive question answering on the Stanford Question Answering Dataset, SQuAD2.0.
For convenience, this model is prepared for use with the PyTorch, TensorFlow, and ONNX frameworks.
Intended uses & limitations
This model can handle mismatched question-context pairs, where the context does not contain an answer to the question. Make sure to specify handle_impossible_answer=True when using QuestionAnsweringPipeline.
Example usage:
>>> from transformers import AutoModelForQuestionAnswering, AutoTokenizer, QuestionAnsweringPipeline
>>> model = AutoModelForQuestionAnswering.from_pretrained("squirro/albert-base-v2-squad_v2")
>>> tokenizer = AutoTokenizer.from_pretrained("squirro/albert-base-v2-squad_v2")
>>> qa_model = QuestionAnsweringPipeline(model, tokenizer)
>>> qa_model(
...     question="What's your name?",
...     context="My name is Clara and I live in Berkeley.",
...     handle_impossible_answer=True  # important!
... )
{'score': 0.9027367830276489, 'start': 11, 'end': 16, 'answer': 'Clara'}
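With handle_impossible_answer=True, the pipeline may conclude that the context holds no answer. The following is a minimal sketch of how calling code might interpret the result. It assumes the pipeline signals "no answer" with an empty answer string, which is worth verifying against your installed Transformers version. The helper name describe_answer and the second result dict are hypothetical.

```python
def describe_answer(result: dict) -> str:
    """Turn a QuestionAnsweringPipeline-style result dict into a readable string.

    Assumption: an unanswerable question is reported with an empty "answer".
    """
    answer = result.get("answer", "").strip()
    if not answer:
        return "No answer found in the context."
    return f"{answer} (score: {result['score']:.2f})"


# Result dict copied from the usage example.
answerable = {"score": 0.9027367830276489, "start": 11, "end": 16, "answer": "Clara"}
print(describe_answer(answerable))

# Hypothetical result for a question the context cannot answer.
unanswerable = {"score": 0.99, "start": 0, "end": 0, "answer": ""}
print(describe_answer(unanswerable))
```

Checking for an empty answer before using the span keeps downstream code from treating "no answer" as a valid extraction.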
Training and evaluation data
Training and evaluation were done on SQuAD2.0.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- distributed_type: tpu
- num_devices: 8
- total_train_batch_size: 256
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
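The batch-size and throughput figures in this card are related by simple arithmetic. The sketch below reproduces them, assuming no gradient accumulation, that the last partial batch of each epoch is kept, and that train_runtime is reported in seconds. The train_samples and train_runtime values come from the training results in this card.

```python
import math

train_batch_size = 32    # per-device training batch size
eval_batch_size = 8      # per-device evaluation batch size
num_devices = 8          # TPU cores
num_epochs = 3
train_samples = 131958   # from the training results
train_runtime = 1402     # seconds, as reported (rounded)

# Effective batch sizes across all devices.
total_train_batch_size = train_batch_size * num_devices  # 256
total_eval_batch_size = eval_batch_size * num_devices    # 64

# Optimizer steps, assuming the final partial batch is kept.
steps_per_epoch = math.ceil(train_samples / total_train_batch_size)  # 516
total_steps = steps_per_epoch * num_epochs                           # 1548

# Throughput implied by the reported runtime.
steps_per_second = total_steps / train_runtime
samples_per_second = train_samples * num_epochs / train_runtime

print(total_train_batch_size, total_eval_batch_size, total_steps)
print(round(steps_per_second, 3), round(samples_per_second, 1))
```

The computed throughput agrees with the reported train_steps_per_second and train_samples_per_second up to rounding of the runtime.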
Training results
key | value |
---|---|
epoch | 3 |
eval_HasAns_exact | 75.3374 |
eval_HasAns_f1 | 81.7083 |
eval_HasAns_total | 5928 |
eval_NoAns_exact | 82.2876 |
eval_NoAns_f1 | 82.2876 |
eval_NoAns_total | 5945 |
eval_best_exact | 78.8175 |
eval_best_exact_thresh | 0 |
eval_best_f1 | 81.9984 |
eval_best_f1_thresh | 0 |
eval_exact | 78.8175 |
eval_f1 | 81.9984 |
eval_samples | 12171 |
eval_total | 11873 |
train_loss | 0.775293 |
train_runtime | 1402 |
train_samples | 131958 |
train_samples_per_second | 282.363 |
train_steps_per_second | 1.104 |
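The overall scores are the example-weighted averages of the HasAns and NoAns subsets. The short check below recomputes them from the table values. It also reflects why eval_NoAns_exact and eval_NoAns_f1 are identical: for unanswerable questions, the SQuAD2.0 metric scores a prediction as either fully right or fully wrong, so F1 reduces to exact match. The function name weighted_average is illustrative only.

```python
# Subset scores and sizes taken from the training results table.
has_ans_exact, has_ans_f1, has_ans_total = 75.3374, 81.7083, 5928
no_ans_exact, no_ans_f1, no_ans_total = 82.2876, 82.2876, 5945

eval_total = has_ans_total + no_ans_total  # 11873


def weighted_average(has_ans_score: float, no_ans_score: float) -> float:
    """Combine subset scores, weighting each by its number of examples."""
    weighted_sum = has_ans_score * has_ans_total + no_ans_score * no_ans_total
    return weighted_sum / eval_total


overall_exact = weighted_average(has_ans_exact, no_ans_exact)
overall_f1 = weighted_average(has_ans_f1, no_ans_f1)

print(f"eval_exact ~ {overall_exact:.4f}")
print(f"eval_f1    ~ {overall_f1:.4f}")
```

Both values match the reported eval_exact and eval_f1 to within the rounding of the table entries.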
Framework versions
- Transformers 4.18.0.dev0
- PyTorch 1.9.0+cu111
- Datasets 1.18.3
- Tokenizers 0.11.6