---
language:
- en
license: mit
tags:
- bart
- question-answering
- squad
- squad_v2
datasets:
- squad_v2
- squad
base_model: facebook/bart-base
model-index:
- name: sjrhuschlee/bart-base-squad2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 75.223
      name: Exact Match
    - type: f1
      value: 78.443
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 83.406
      name: Exact Match
    - type: f1
      value: 90.377
      name: F1
---
# bart-base for Extractive QA
This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the [SQuAD 2.0](https://huggingface.co/datasets/squad_v2) dataset.
## Overview
**Language model:** bart-base
**Language:** English
**Downstream-task:** Extractive QA
**Training data:** SQuAD 2.0
**Eval data:** SQuAD 2.0
**Infrastructure:** 1x NVIDIA 3070
## Model Usage
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
model_name = "sjrhuschlee/bart-base-squad2"
# a) Using pipelines
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
qa_input = {
    'question': 'Where do I live?',
    'context': 'My name is Sarah and I live in London'
}
res = nlp(qa_input)  # dict with 'score', 'start', 'end', and 'answer' keys
# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
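Option b) above only loads the model and tokenizer. An extractive QA model returns one start logit and one end logit per token, and the answer is the span that maximizes their sum. The following is a minimal sketch of that decoding step in plain Python. The helper names `best_span` and `is_unanswerable`, the `max_answer_len` default, and the convention that index 0 holds the "no answer" score are assumptions made for illustration, not part of this model's published code.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,  # hypothetical default, tune as needed
) -> Tuple[int, int, float]:
    """Return (start, end, score) for the highest-scoring valid span.

    A span is valid when start <= end and it covers at most
    max_answer_len tokens. Index 0 is skipped because this sketch
    assumes it is reserved for the "no answer" prediction.
    """
    best = (0, 0, float("-inf"))
    for s in range(1, len(start_logits)):
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = start_logits[s] + end_logits[e]
            if score > best[2]:
                best = (s, e, score)
    return best


def is_unanswerable(
    start_logits: List[float],
    end_logits: List[float],
    span_score: float,
    threshold: float = 0.0,  # assumed null-score threshold
) -> bool:
    """Compare the assumed "no answer" score at index 0 with the best span."""
    null_score = start_logits[0] + end_logits[0]
    return null_score - span_score > threshold


# Toy logits for a 4-token input (made-up numbers, not model output).
start = [0.1, 0.2, 3.0, 0.5]
end = [0.1, 0.3, 0.4, 2.5]
s, e, score = best_span(start, end)
print(s, e, score, is_unanswerable(start, end, score))
```

With a real forward pass, the lists would come from `outputs.start_logits[0].tolist()` and `outputs.end_logits[0].tolist()`, and the token indices would be mapped back to text with the tokenizer. The `pipeline` in option a) handles all of this internally.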
## Metrics
```bash
# Squad v2
{
"eval_HasAns_exact": 76.45074224021593,
"eval_HasAns_f1": 82.88605283171232,
"eval_HasAns_total": 5928,
"eval_NoAns_exact": 74.01177460050462,
"eval_NoAns_f1": 74.01177460050462,
"eval_NoAns_total": 5945,
"eval_best_exact": 75.23793481007327,
"eval_best_exact_thresh": 0.0,
"eval_best_f1": 78.45098300230696,
"eval_best_f1_thresh": 0.0,
"eval_exact": 75.22951233892024,
"eval_f1": 78.44256053115387,
"eval_runtime": 131.875,
"eval_samples": 11955,
"eval_samples_per_second": 90.654,
"eval_steps_per_second": 3.784,
"eval_total": 11873
}
# Squad
{
"eval_exact_match": 83.40586565752129,
"eval_f1": 90.37706849113668,
"eval_runtime": 117.2093,
"eval_samples": 10619,
"eval_samples_per_second": 90.599,
"eval_steps_per_second": 3.78
}
```
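As a sanity check on the SQuAD v2 numbers above, the overall `eval_exact` and `eval_f1` scores should equal the average of the `HasAns` and `NoAns` scores weighted by their example counts. The sketch below recomputes them from the reported values. The variable and function names are chosen here for illustration.

```python
# Values copied from the SQuAD v2 results above.
has_ans = {"exact": 76.45074224021593, "f1": 82.88605283171232, "total": 5928}
no_ans = {"exact": 74.01177460050462, "f1": 74.01177460050462, "total": 5945}


def weighted_average(metric: str) -> float:
    """Combine the HasAns and NoAns scores, weighted by example count."""
    total = has_ans["total"] + no_ans["total"]
    weighted_sum = (
        has_ans[metric] * has_ans["total"] + no_ans[metric] * no_ans["total"]
    )
    return weighted_sum / total


overall_exact = weighted_average("exact")
overall_f1 = weighted_average("f1")
print(f"exact: {overall_exact:.3f}, f1: {overall_f1:.3f}")
```

The two subsets sum to 11873 examples, matching `eval_total`. For unanswerable questions, exact match and F1 coincide because a prediction is either the empty string or it is not, which is why `eval_NoAns_exact` and `eval_NoAns_f1` are identical.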
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- max_seq_length: 512
- doc_stride: 128
- learning_rate: 2e-06
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 6
- total_train_batch_size: 96
- optimizer: Adam8Bit with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4.0
- gradient_checkpointing: True
- tf32: True
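Two of the values above follow from the others. The total batch size is the per-device batch size times the gradient accumulation steps times the number of devices, and a linear scheduler with a warmup ratio ramps the learning rate up and then decays it to zero. The sketch below illustrates both. The device count of 1 is inferred from the infrastructure note, and `linear_lr` and the `total_steps` value of 1000 are hypothetical, since the card does not report the number of optimizer steps.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 16
gradient_accumulation_steps = 6
learning_rate = 2e-06
warmup_ratio = 0.1
num_devices = 1  # assumption: single GPU ("1x NVIDIA 3070")

total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)
print(total_train_batch_size)  # 96


def linear_lr(step: int, total_steps: int) -> float:
    """Linear warmup to the peak rate, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = total_steps - step
    return learning_rate * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Hypothetical run length, used only to show the shape of the schedule.
total_steps = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, total_steps))
```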
### Framework versions
- Transformers 4.30.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3