Text Generation · Transformers · PyTorch · gpt_bigcode · code · Eval Results · text-generation-inference

turn on use_cache for fast inference

#5 opened by loubnabnl (HF staff)

activate use_cache to speed up inference
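
The PR presumably flips `use_cache` to `true` in the model's config.json (the discussion's original title was "Update config.json"), so that generation reuses cached key/value attention states instead of recomputing them at every decoding step. A minimal sketch of what this enables with `transformers` is below; the checkpoint name is a placeholder for whichever gpt_bigcode repo this PR targets, not necessarily the real one.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the actual model repo this PR belongs to.
checkpoint = "bigcode/starcoder"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Equivalent to setting "use_cache": true in config.json: past key/value
# states are cached and reused at each decoding step.
model.config.use_cache = True

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
print(tokenizer.decode(outputs[0]))
```

Without the cache, each new token requires re-running attention over the full prefix, which is where the large slowdown reported in this thread comes from.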

loubnabnl changed pull request title from "Update config.json" to "turn on use_cache for fast inference"

@loubnabnl May I ask how much of a speedup this option gave in your case? I found little difference after activating use_cache.

Oh, my bad, I had already activated use_cache. The boost is about 9X, thanks!

Ready to merge
This branch is ready to be merged automatically.