---
language:
- en
license: mit
tags:
- bart
- question-answering
- squad
- squad_v2
datasets:
- squad_v2
- squad
base_model: facebook/bart-base
model-index:
- name: sjrhuschlee/bart-base-squad2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 75.223
      name: Exact Match
    - type: f1
      value: 78.443
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 83.406
      name: Exact Match
    - type: f1
      value: 90.377
      name: F1
---

# bart-base for Extractive QA

This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the [SQuAD2.0](https://huggingface.co/datasets/squad_v2) dataset.

## Overview
**Language model:** bart-base  
**Language:** English  
**Downstream-task:** Extractive QA  
**Training data:** SQuAD 2.0  
**Eval data:** SQuAD 2.0  
**Infrastructure:** 1x NVIDIA 3070


## Model Usage
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "sjrhuschlee/bart-base-squad2"

# a) Using pipelines
nlp = pipeline("question-answering", model=model_name, tokenizer=model_name)
qa_input = {
    "question": "Where do I live?",
    "context": "My name is Sarah and I live in London",
}
res = nlp(qa_input)  # dict with "score", "start", "end", and "answer" keys

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
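
When the model and tokenizer are loaded directly (option b), the raw outputs are start and end logits that still need to be turned into an answer span. The following is a minimal sketch of one common way to do that for SQuAD 2.0 style models, not necessarily the post-processing used to produce the numbers in this card. It assumes that index 0 (the sequence-start token) stands for "no answer"; the function name `best_answer_span` and the parameters `max_answer_len` and `null_threshold` are hypothetical and chosen for illustration only.

```python
def best_answer_span(start_logits, end_logits, max_answer_len=30, null_threshold=0.0):
    """Pick the highest-scoring (start, end) token span, or None for "no answer".

    Assumption: position 0 is the sequence-start token and its combined
    score is treated as the "no answer" score. For brevity this sketch does
    not mask out question tokens, which a full implementation would do.
    """
    null_score = start_logits[0] + end_logits[0]
    best_score, best_span = float("-inf"), None
    for start in range(1, len(start_logits)):
        # Only consider spans that end at or after the start and are not too long.
        last_end = min(start + max_answer_len, len(end_logits))
        for end in range(start, last_end):
            score = start_logits[start] + end_logits[end]
            if score > best_score:
                best_score, best_span = score, (start, end)
    # Predict "no answer" when the null score beats the best span by more
    # than the threshold.
    if best_span is None or null_score - best_score > null_threshold:
        return None
    return best_span


# Toy logits (made up for illustration, not real model output).
answerable = best_answer_span([0.1, 0.2, 5.0, 0.3], [0.1, 0.1, 0.2, 4.0])
unanswerable = best_answer_span([6.0, 0.1, 0.2], [6.0, 0.1, 0.3])
print(answerable, unanswerable)
```

With real model output, the returned token indices would be mapped back to text through the tokenizer, for example by decoding the corresponding slice of `input_ids`.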

## Metrics

```bash
# SQuAD v2
{
    "eval_HasAns_exact": 76.45074224021593,
    "eval_HasAns_f1": 82.88605283171232,
    "eval_HasAns_total": 5928,
    "eval_NoAns_exact": 74.01177460050462,
    "eval_NoAns_f1": 74.01177460050462,
    "eval_NoAns_total": 5945,
    "eval_best_exact": 75.23793481007327,
    "eval_best_exact_thresh": 0.0,
    "eval_best_f1": 78.45098300230696,
    "eval_best_f1_thresh": 0.0,
    "eval_exact": 75.22951233892024,
    "eval_f1": 78.44256053115387,
    "eval_runtime": 131.875,
    "eval_samples": 11955,
    "eval_samples_per_second": 90.654,
    "eval_steps_per_second": 3.784,
    "eval_total": 11873
}

# SQuAD
{
    "eval_exact_match": 83.40586565752129,
    "eval_f1": 90.37706849113668,
    "eval_runtime": 117.2093,
    "eval_samples": 10619,
    "eval_samples_per_second": 90.599,
    "eval_steps_per_second": 3.78
}
```
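
As a sanity check on how the SQuAD v2 numbers relate to each other, the overall `eval_exact` and `eval_f1` should equal the example-weighted average of the `HasAns` and `NoAns` scores. The sketch below recomputes them from the values reported above; the helper name `weighted_score` is illustrative only.

```python
# Values copied from the SQuAD v2 evaluation output above.
has_ans = {"exact": 76.45074224021593, "f1": 82.88605283171232, "total": 5928}
no_ans = {"exact": 74.01177460050462, "f1": 74.01177460050462, "total": 5945}


def weighted_score(metric):
    """Average a metric over both subsets, weighted by example count."""
    total = has_ans["total"] + no_ans["total"]
    weighted_sum = (
        has_ans[metric] * has_ans["total"] + no_ans[metric] * no_ans["total"]
    )
    return weighted_sum / total


overall_exact = weighted_score("exact")
overall_f1 = weighted_score("f1")
print(f"exact={overall_exact:.3f} f1={overall_f1:.3f}")
```

Both values should land very close to the reported `eval_exact` and `eval_f1`, with `eval_total` (11873) equal to the sum of the two subset sizes.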

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- max_seq_length: 512
- doc_stride: 128
- learning_rate: 2e-06
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 6
- total_train_batch_size: 96
- optimizer: Adam8Bit with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4.0
- gradient_checkpointing: True
- tf32: True
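
To make the relationship between some of these values concrete, the sketch below shows how `total_train_batch_size` follows from `train_batch_size` and `gradient_accumulation_steps` on a single GPU, and what a linear schedule with a 0.1 warmup ratio looks like. It is an illustration under stated assumptions rather than the exact training code: the total step count of 1000 is hypothetical (the real number depends on how many training features the tokenization produced), and the function names are made up for this example.

```python
import math

# Values taken from the hyperparameter list above.
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 6
LEARNING_RATE = 2e-06
WARMUP_RATIO = 0.1


def effective_batch_size(per_device=TRAIN_BATCH_SIZE, accum=GRAD_ACCUM_STEPS, n_devices=1):
    """Number of examples contributing to each optimizer step."""
    return per_device * accum * n_devices


def linear_lr(step, total_steps, peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 1000  # hypothetical, for illustration only
print(effective_batch_size())
for step in (0, 100, 550, 1000):
    print(step, linear_lr(step, total_steps))
```

With these settings the learning rate rises to 2e-06 over the first 10% of optimizer steps and then decays linearly to zero by the end of training.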


### Framework versions

- Transformers 4.30.0.dev0
- PyTorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3