optimum-neuron-cache / inference-cache-config

Commit History

Added Llama-70b batch_size 4 to inference cache
593822e
verified

dacorvo HF staff committed on

Create mistral.json
b5d0afd
verified

philschmid HF staff committed on

Create gpt2.json
3bdb891
verified

philschmid HF staff committed on

Create inference-cache-config/llama.json
1960ccb
verified

philschmid HF staff committed on