AlgorithmicResearchGroup/flan-t5-base-arxiv-cs-ml-question-answering


	---
	license: apache-2.0
	language:
	- en
	pipeline_tag: summarization
	widget:
	- text: What is an LSTM?
	  example_title: Question Answering
	tags:
	- arxiv
	---
	# Table of Contents

	0. [TL;DR](#tldr)
	1. [Model Details](#model-details)
	2. [Usage](#usage)
	3. [Training Details](#training-details)
	4. [Citation](#citation)

	# TL;DR

	This is a FLAN-T5 base model fine-tuned on [ArtifactAI/arxiv-cs-ml-instruct-tune-50k](https://huggingface.co/datasets/ArtifactAI/arxiv-cs-ml-instruct-tune-50k). It is intended for research purposes only and *should not be used in production settings*: its output is highly unreliable.

	# Model Details

	## Model Description


	- Model type: Language model
	- Language(s) (NLP): English
	- License: Apache 2.0
	- Related Models: [All FLAN-T5 Checkpoints](https://huggingface.co/models?search=flan-t5)

	# Usage

	The example scripts below show how to use the model with `transformers`:

	## Using the PyTorch model

	### Running the model on a CPU


	```python
	from transformers import T5Tokenizer, T5ForConditionalGeneration

	# Load the tokenizer and model weights from the Hugging Face Hub.
	tokenizer = T5Tokenizer.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")
	model = T5ForConditionalGeneration.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")

	# Tokenize the question into PyTorch tensors.
	input_text = "What is an LSTM?"
	input_ids = tokenizer(input_text, return_tensors="pt").input_ids

	# Generate an answer and decode it without special tokens such as <pad> and </s>.
	outputs = model.generate(input_ids)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```
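The examples in this section call `generate` with library defaults, which can cut answers short (in many `transformers` releases the default maximum length is small unless the model's generation config overrides it). The sketch below passes decoding settings explicitly. The helper name `generate_answer`, the `demo` function, and the values `max_new_tokens=128` and `num_beams=4` are illustrative assumptions, not settings documented for this model.

```python
def generate_answer(model, tokenizer, question, max_new_tokens=128, num_beams=4):
    """Generate an answer with explicit decoding settings (illustrative values)."""
    input_ids = tokenizer(question, return_tensors="pt").input_ids

    # Keep the inputs on the same device as the model when it exposes one.
    device = getattr(model, "device", None)
    if device is not None:
        input_ids = input_ids.to(device)

    outputs = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,  # upper bound on generated tokens
        num_beams=num_beams,            # beam search width
    )
    # Drop special tokens such as <pad> and </s> from the decoded text.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


def demo():
    # Imported here so the helper above can be reused without loading the model.
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    model_id = "ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering"
    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    print(generate_answer(model, tokenizer, "What is an LSTM?"))
```

Calling `demo()` downloads the checkpoint and prints one answer. Longer limits and wider beams trade speed for potentially more complete output, and they do not make the answers more reliable.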


	### Running the model on a GPU


	```python
	# pip install accelerate
	from transformers import T5Tokenizer, T5ForConditionalGeneration

	# device_map="auto" lets accelerate place the model weights on the available GPU.
	tokenizer = T5Tokenizer.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")
	model = T5ForConditionalGeneration.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering", device_map="auto")

	# Tokenize the question and move the input tensor to the GPU.
	input_text = "What is an LSTM?"
	input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

	# Generate an answer and decode it without special tokens such as <pad> and </s>.
	outputs = model.generate(input_ids)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```


	### Running the model in an HF pipeline

	#### FP16


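	```python
	from transformers import pipeline

	# Load the model and tokenizer from the Hugging Face Hub through a pipeline.
	# Note: as written, this loads the model in the default precision; pass
	# torch_dtype=torch.float16 (after `import torch`) to request FP16 weights.
	qa = pipeline("summarization", model="ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")

	query = "What is an LSTM?"
	print(f"query: {query}")
	# The question is prefixed with "answer: " before it is passed to the model.
	res = qa("answer: " + query)

	print(res[0]["summary_text"])
	```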
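To ask several questions through the same pipeline, the prefixing and result lookup from the example above can be wrapped in small helpers. This is a minimal sketch: `build_prompt`, `answer_questions`, and `demo` are hypothetical names, and the second question in `demo` is only an illustration. The sketch assumes the pipeline returns a list of dicts with a `summary_text` key, as in the example above.

```python
def build_prompt(question, prefix="answer: "):
    """Prepend the prefix used in the pipeline example to a question."""
    return prefix + question.strip()


def answer_questions(qa, questions):
    """Run each question through `qa` and map question -> generated text.

    `qa` is any callable that behaves like the summarization pipeline:
    it takes a string and returns [{"summary_text": ...}].
    """
    answers = {}
    for question in questions:
        result = qa(build_prompt(question))
        answers[question] = result[0]["summary_text"]
    return answers


def demo():
    # Imported here so the helpers above can be used without transformers installed.
    from transformers import pipeline

    qa = pipeline(
        "summarization",
        model="ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering",
    )
    questions = ["What is an LSTM?", "What is dropout?"]  # illustrative inputs
    for question, answer in answer_questions(qa, questions).items():
        print(f"query: {question}")
        print(answer)
```

Because `answer_questions` only depends on a callable, it can be exercised with a stub before the real model is loaded.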


	# Training Details

	## Training Data

	The model was trained on [ArtifactAI/arxiv-cs-ml-instruct-tune-50k](https://huggingface.co/datasets/ArtifactAI/arxiv-cs-ml-instruct-tune-50k), a dataset of question/answer pairs. The questions were generated with the t5-base model and the answers with the GPT-3.5-turbo model.
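As a rough illustration of how question/answer pairs like these could be turned into inputs and targets for a sequence-to-sequence model, consider the sketch below. It is not the author's preprocessing code: the field names `question` and `answer`, the output keys, and the reuse of the `answer: ` prefix from the pipeline example are all assumptions, and the records are placeholders rather than rows from the dataset.

```python
def to_seq2seq_example(record, prefix="answer: "):
    """Map one question/answer record to an (input, target) text pair.

    Returns None when either field is missing or empty so that
    incomplete records can be filtered out.
    """
    question = (record.get("question") or "").strip()
    answer = (record.get("answer") or "").strip()
    if not question or not answer:
        return None
    return {"input_text": prefix + question, "target_text": answer}


# Placeholder records with hypothetical field names.
records = [
    {"question": "What is an LSTM?", "answer": "<answer text for this question>"},
    {"question": "", "answer": "<answer with no question>"},
]

# Keep only the records that produced a complete pair.
examples = [ex for ex in map(to_seq2seq_example, records) if ex is not None]
print(examples)
```

Each resulting pair would then be tokenized, with `input_text` as the encoder input and `target_text` as the labels.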

	# Citation

	```bibtex
	@misc{flan-t5-base-arxiv-cs-ml-question-answering,
	  title={flan-t5-base-arxiv-cs-ml-question-answering},
	  author={Matthew Kenney},
	  year={2023}
	}
	```