Replication of Inference API
#63, opened by henningheyen
Hi everyone.
I want to use gpt2 for sentiment classification with a few-shot approach (see screenshot). I need exactly one word to be generated: either "positive" or "negative". The Inference API on the website works well, but I am finding it difficult to replicate locally. What happens behind the scenes? I am using GPT2LMHeadModel, GPT2Tokenizer and the forward() method. Also, inference seems much faster online than locally.
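For concreteness, here is a minimal sketch of the few-shot setup described above. It is not the Inference API's actual behaviour, which is the open question here. The prompt template ("Review: ... / Sentiment: ..."), the example reviews, and the helper names `build_few_shot_prompt` and `pick_label` are all hypothetical, since the screenshot is not reproduced in text. The scores passed to `pick_label` are assumed to come from the last-position logits returned by GPT2LMHeadModel for the first token of " positive" and " negative".

```python
from typing import Dict, List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Join labelled examples and the unlabelled query into one prompt.

    The template is an assumption. The prompt ends right after "Sentiment:"
    with no trailing space, because GPT-2's vocabulary encodes the leading
    space as part of the next word (" positive", " negative").
    """
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)


def pick_label(label_scores: Dict[str, float]) -> str:
    """Return the label with the highest score.

    Restricting the choice to the candidate labels guarantees a one-word
    answer, unlike free-running generation.
    """
    return max(label_scores, key=label_scores.get)


# Hypothetical few-shot examples and query.
examples = [("Great movie", "positive"), ("Terrible plot", "negative")]
prompt = build_few_shot_prompt(examples, "I loved it")

# Placeholder scores standing in for the model's next-token logits.
label = pick_label({"positive": -1.2, "negative": -3.4})
```

Comparing only the two candidate labels avoids sampling altogether, which is one possible reason a local result could differ from the hosted widget if the latter generates text with its own decoding settings.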
Any help is much appreciated. Thank you.