---
language: en
datasets:
- vblagoje/lfqa
- vblagoje/lfqa_support_docs
license: mit
---
## Introduction
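This is a BART-based sequence-to-sequence model for long-form question answering (LFQA): given a question and a set of supporting documents, it generates an abstractive answer. See the [blog post](https://towardsdatascience.com/long-form-qa-beyond-eli5-an-updated-dataset-and-approach-319cb841aabb) for more details.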
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_name = "vblagoje/bart_lfqa"
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model = model.to(device)
# it all starts with a question/query
query = "Why does water heated to room temperature feel colder than the air around it?"
# suppose a document store returned the following supporting documents for the question above
documents = ["when the skin is completely wet. The body continuously loses water by...",
             "at greater pressures. There is an ambiguity, however, as to the meaning of the terms 'heating' and 'cooling'...",
             "are not in a relation of thermal equilibrium, heat will flow from the hotter to the colder, by whatever pathway...",
             "air condition and moving along a line of constant enthalpy toward a state of higher humidity. A simple example ...",
             "Thermal contact conductance In physics, thermal contact conductance is the study of heat conduction between solid ..."]
# concatenate question and support documents into BART input
conditioned_doc = "<P> " + " <P> ".join(documents)
query_and_docs = "question: {} context: {}".format(query, conditioned_doc)
model_input = tokenizer(query_and_docs, truncation=True, padding=True, return_tensors="pt")
generated_answers_encoded = model.generate(
    input_ids=model_input["input_ids"].to(device),
    attention_mask=model_input["attention_mask"].to(device),
    min_length=64,
    max_length=256,
    do_sample=False,
    early_stopping=True,
    num_beams=8,
    temperature=1.0,
    top_k=None,
    top_p=None,
    eos_token_id=tokenizer.eos_token_id,
    no_repeat_ngram_size=3,
    num_return_sequences=1,
)
# decode the generated token ids back into text
answers = tokenizer.batch_decode(generated_answers_encoded, skip_special_tokens=True, clean_up_tokenization_spaces=True)
print(answers)
# below is the abstractive answer generated by the model
["When you heat water to room temperature, it loses heat to the air around it. When you cool it down, it gains heat back from the air, which is why it feels colder than the air surrounding it. It's the same reason why you feel cold when you turn on a fan. The air around you is losing heat, and the water is gaining heat."]
```
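The input format used in the snippet above (`question: ... context: <P> doc <P> doc ...`) can be wrapped in a small helper so it is built the same way for every query. This is only a minimal sketch: the name `build_lfqa_input` is hypothetical, and stripping whitespace and skipping empty passages is an assumption that is not part of the original snippet.

```python
from typing import Sequence


def build_lfqa_input(query: str, documents: Sequence[str]) -> str:
    """Join a question and its supporting passages into one input string."""
    # Assumption (not in the original snippet): trim whitespace and skip empty passages.
    passages = [d.strip() for d in documents if d.strip()]
    # Prefix every passage with the "<P>" marker, as in the usage snippet.
    context = " ".join("<P> " + p for p in passages)
    return "question: {} context: {}".format(query, context)


example = build_lfqa_input(
    "Why is the sky blue?",
    ["Sunlight is scattered by the atmosphere.", "Blue light scatters more than red light."],
)
print(example)
```

The returned string can be passed to `tokenizer(...)` in place of `query_and_docs`.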
## Author
- Vladimir Blagojevic: `dovlex [at] gmail.com` [Twitter](https://twitter.com/vladblagoje) | [LinkedIn](https://www.linkedin.com/in/blagojevicvladimir/)