optimum-neuron-cache / inference-cache-config

dacorvo HF staff

Add more batch_size for mistral on smaller instances

545cd4d verified 9 months ago