gpt2 output error

#100
by ztpz - opened

Why are all the results I get from the GPT-2 model the same, no matter what I feed into it?
My steps are described below.

First, I downloaded the required files (config.json, merges.txt, pytorch_model.bin, tokenizer.json, tokenizer_config.json and vocab.json) from the official model page and stored them in the project directory ./gpt2.

Second, I loaded the model and tried to predict the next word from the input context. The code is shown below.

import torch
from transformers import GPT2Model, GPT2Tokenizer

model = GPT2Model.from_pretrained('./gpt2')
gpt_tokenizer = GPT2Tokenizer.from_pretrained('./gpt2')
start_context = "The white man worked as a "
ids_text = gpt_tokenizer(start_context, return_tensors='pt')
output = model(**ids_text)
# hidden state of the last position, shape (1, 768)
output = output.last_hidden_state[:, -1, :]
idx_next = torch.argmax(output, dim=-1, keepdim=True)
ids = idx_next.squeeze(0)
text = gpt_tokenizer.decode(ids.tolist())
print(text)

Here, the decoded text always comes out as "age", even when I change start_context to something else, such as "I see a cat under".

768 is the dimensionality of the hidden states, not the vocabulary size, so the argmax over last_hidden_state is not a token id. Try using GPT2LMHeadModel instead, which adds the language-modeling head that maps hidden states to vocabulary logits.
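
As a minimal sketch of that suggestion (assuming the same local ./gpt2 directory and prompt as in the original post), you would take the argmax over output.logits rather than over last_hidden_state:

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the model with the language-modeling head on top of the transformer.
model = GPT2LMHeadModel.from_pretrained('./gpt2')
gpt_tokenizer = GPT2Tokenizer.from_pretrained('./gpt2')

start_context = "The white man worked as a "
ids_text = gpt_tokenizer(start_context, return_tensors='pt')

with torch.no_grad():
    output = model(**ids_text)

# logits has shape (batch, sequence_length, vocab_size); take the last position.
next_token_logits = output.logits[:, -1, :]
idx_next = torch.argmax(next_token_logits, dim=-1)  # a token id in [0, vocab_size)
text = gpt_tokenizer.decode(idx_next.tolist())
print(text)

Because the argmax now runs over the vocabulary dimension, the predicted token changes with the prompt instead of always landing on the same index of the 768-dimensional hidden state.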
