license: gpl-3.0

Description

This model demonstrates that GPT-J can work very well as an "instruct" model when properly fine-tuned. It is an fp16 version, which makes it easy to deploy the model on an entry-level GPU such as an NVIDIA Tesla T4. Want to know more about NLP Cloud? Have a look at our platform here.

We fine-tuned GPT-J on an instruction dataset created by the Stanford Alpaca team. You can find the original dataset here.

The dataset was slightly reworked to match the GPT-J fine-tuning format used by Mesh Transformer JAX on TPUs. Here is the final dataset we used.

The base GPT-J model needs few-shot learning in order to properly understand what you want. See more details here about how to properly use few-shot learning. For example, let's say that you want to correct spelling with GPT-J. Here is an example of a prompt you would have to use:

I love goin to the beach.
Correction: I love going to the beach.
###
Let me hav it!
Correction: Let me have it!
###
It have too many drawbacks.
Correction: It has too many drawbacks.
###
I do not wan to go
Correction:
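
As a minimal sketch, the few-shot prompt above can be assembled programmatically from a list of solved examples. The helper name `build_few_shot_prompt` and its parameters are hypothetical and only illustrate the layout (examples separated by `###`, with the last block left open for the model to complete):

```python
def build_few_shot_prompt(examples, query, label="Correction:", separator="###"):
    """Assemble a few-shot prompt: solved examples first, then the open query.

    `examples` is a list of (source, target) pairs. This helper is hypothetical;
    it only mirrors the prompt layout shown in this README.
    """
    blocks = [f"{source}\n{label} {target}" for source, target in examples]
    # The final block ends with the bare label so the model completes it.
    blocks.append(f"{query}\n{label}")
    return f"\n{separator}\n".join(blocks)


examples = [
    ("I love goin to the beach.", "I love going to the beach."),
    ("Let me hav it!", "Let me have it!"),
    ("It have too many drawbacks.", "It has too many drawbacks."),
]
prompt = build_few_shot_prompt(examples, "I do not wan to go")
print(prompt)
```

Printing `prompt` reproduces the example prompt shown above line for line.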

Now, with Instruct GPT-J, you can ask things in natural language "like a human":

Correct spelling and grammar from the following text.
I do not wan to go

Which returns the following:

I do not want to go.
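
For comparison, an instruction prompt needs no examples at all. The sketch below assumes the instruction and the input text are simply joined by a single newline, as in the prompt strings used in this README; `build_instruction_prompt` is a hypothetical helper name.

```python
def build_instruction_prompt(instruction: str, text: str) -> str:
    """Join an instruction and its input text with a single newline.

    Hypothetical helper: it assumes the instruction-then-text layout
    used by the example prompts in this README.
    """
    return f"{instruction.strip()}\n{text.strip()}"


prompt = build_instruction_prompt(
    "Correct spelling and grammar from the following text.",
    "I do not wan to go",
)
print(prompt)
```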

You can still use few-shot learning with this model for very advanced use cases.

How To Use The Model?

Here is how to use the model in fp16 with the text generation pipeline:

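from transformers import pipeline
import torch

# Load the model in half precision on the first CUDA GPU (device=0).
generator = pipeline(model="nlpcloud/instruct-gpt-j", torch_dtype=torch.float16, device=0)

# The instruction and the text to process are separated by a newline.
prompt = "Correct spelling and grammar from the following text.\nI do not wan to go"

# The pipeline returns a list of dicts, each with a "generated_text" key.
print(generator(prompt))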

You can also call the generate() function directly:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and the fp16 model, then move the model to the GPU.
tokenizer = AutoTokenizer.from_pretrained("nlpcloud/instruct-gpt-j")
generator = AutoModelForCausalLM.from_pretrained("nlpcloud/instruct-gpt-j", torch_dtype=torch.float16).cuda()

prompt = "Correct spelling and grammar from the following text.\nI do not wan to go"

# Tokenize the prompt and generate on the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt")
outputs = generator.generate(inputs.input_ids.cuda())

# The decoded sequence contains the prompt followed by the generated tokens.
print(tokenizer.decode(outputs[0]))
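
Because a causal language model returns the input tokens followed by the newly generated ones, the decoded string starts with the prompt. A minimal sketch for keeping only the answer is shown below; `strip_prompt` is a hypothetical helper, and the `decoded` value is a made-up stand-in for real model output, used only to illustrate the string handling.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Return only the generated continuation if `decoded` starts with `prompt`."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # Fall back to the full text when the prefix does not match exactly
    # (for example if decoding altered whitespace).
    return decoded.strip()


prompt = "Correct spelling and grammar from the following text.\nI do not wan to go"
# Hypothetical decoded output, not an actual model response.
decoded = prompt + "\nI do not want to go."
print(strip_prompt(decoded, prompt))
```

With real output, `decoded` would be the string returned by `tokenizer.decode(outputs[0])`.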

Hardware Requirements

This model is an fp16 version of our fine-tuned model, which works very well on a GPU with 16 GB of VRAM such as an NVIDIA Tesla T4.

We did not notice any difference between the fp32 and fp16 versions in terms of quality.
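
As a rough back-of-the-envelope check of the 16 GB figure, the sketch below estimates the memory taken by the weights alone. It assumes about 6 billion parameters for GPT-J (an approximation that is not stated in this README) and ignores activations, the CUDA context, and other runtime overhead, so real usage will be higher.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Estimate the memory needed for the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Assumption: roughly 6 billion parameters for GPT-J.
NUM_PARAMS = 6_000_000_000

for dtype in ("fp32", "fp16"):
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the fp16 weights need about 11.2 GiB, which fits on a 16 GB card, while the fp32 weights need about 22.4 GiB and would not.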