winglian and TheBloke committed
Commit 64c10ed (1 parent: 3acb895)

Set use_cache to True, otherwise inference performance is poor (#2)


- Set use_cache to True, otherwise inference performance is poor (5e8c41ad549c0d528bceee10dad15b8dfb6feb38)


Co-authored-by: Tom Jobbins <TheBloke@users.noreply.huggingface.co>
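
A rough illustration of why the commit message calls performance poor without the cache: the toy count below assumes a decoder that must process every token position whose key/value states are not already cached. The function name and the cost model are hypothetical and are not taken from the transformers implementation.

```python
def positions_processed(prompt_len: int, new_tokens: int, use_cache: bool) -> int:
    """Count token positions run through the model while generating (toy cost model)."""
    total = 0
    seq_len = prompt_len
    for step in range(new_tokens):
        if use_cache and step > 0:
            # Earlier keys/values are reused, so only the newest token is processed.
            total += 1
        else:
            # No cache (or the first step): the whole sequence is processed.
            total += seq_len
        seq_len += 1  # one token is appended after every step
    return total


with_cache = positions_processed(prompt_len=10, new_tokens=5, use_cache=True)
without_cache = positions_processed(prompt_len=10, new_tokens=5, use_cache=False)
print(with_cache, without_cache)  # 14 60
```

Under this simplified model the uncached cost grows roughly quadratically with the number of generated tokens, while the cached cost grows linearly.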

Files changed (1)
  1. config.json +1 -1
config.json CHANGED
@@ -19,6 +19,6 @@
   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.30.0.dev0",
-  "use_cache": false,
+  "use_cache": true,
   "vocab_size": 32000
 }
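
The same one-line change could be applied to a local copy of a config with the standard library alone. This is a minimal sketch: the helper name `enable_use_cache` and the sample JSON are assumptions for illustration, and only the keys shown in the diff above come from the actual file.

```python
import json


def enable_use_cache(config_text: str) -> str:
    """Return the config JSON with "use_cache" set to true; other keys are untouched."""
    config = json.loads(config_text)
    config["use_cache"] = True
    return json.dumps(config, indent=2)


# Hypothetical excerpt that mirrors the keys visible in the diff.
sample = '{"torch_dtype": "bfloat16", "use_cache": false, "vocab_size": 32000}'
patched = json.loads(enable_use_cache(sample))
print(patched["use_cache"])  # True
```

Editing config.json changes the default for everyone who loads the model, which is what this commit does.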