vllm-inference / download_model.py

Commit History

feat(add-model): always download the model during build; it will be cached in subsequent builds
8679a35

yusufs committed
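
Commit 8679a35 moves the model download into the image build so that later builds reuse the cached weights. A minimal sketch of what such a build-time download_model.py could look like, assuming huggingface_hub is available and MODEL_ID is an illustrative environment-variable name (not necessarily what the repo uses):

```python
# Build-time download sketch (illustrative; not the repo's actual script).
# Assumes `huggingface_hub` is installed and MODEL_ID is set by the build.
import os

from huggingface_hub import snapshot_download


def main() -> None:
    model_id = os.environ["MODEL_ID"]  # hypothetical variable name
    # Files land in the Hugging Face cache (HF_HOME, ~/.cache/huggingface by
    # default), so rerunning this step in a later build hits the cache
    # instead of re-downloading.
    local_path = snapshot_download(repo_id=model_id)
    print(f"model cached at {local_path}")


if __name__ == "__main__":
    main()
```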

feat(reduce-max-num-batched-tokens): reduce max-num-batched-tokens even though the error message suggests reducing max_model_len
13a5c22

yusufs committed
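
Commit 13a5c22 lowers max-num-batched-tokens even though vLLM's memory error points at max_model_len. A hedged sketch of that trade-off with vLLM's Python API, where the model id and numbers are placeholders rather than the repo's actual values:

```python
# Illustrative vLLM engine settings; model id and values are placeholders.
# Lowering max_num_batched_tokens shrinks the scheduler's per-step token
# budget (and peak memory) while max_model_len keeps the full context window.
from vllm import LLM

llm = LLM(
    model="facebook/opt-125m",     # placeholder model
    max_model_len=4096,            # context window kept as-is
    max_num_batched_tokens=4096,   # reduce this first when memory is tight
)
```

By default vLLM requires max_num_batched_tokens to be at least max_model_len, so the two are kept equal here; chunked prefill relaxes that constraint.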

feat(hf_token): set hf token during build
493a5f1

yusufs committed
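
Commit 493a5f1 wires the Hugging Face token into the build so that gated or private repos can be fetched. A sketch, assuming the token arrives as an HF_TOKEN build secret (the variable name is an assumption):

```python
# Token-aware download sketch (illustrative). Assumes the build injects the
# token as the HF_TOKEN environment variable; gated/private repos require it.
import os

from huggingface_hub import snapshot_download


def download_model(model_id: str) -> str:
    token = os.environ.get("HF_TOKEN")  # assumed name; None is fine for public repos
    return snapshot_download(repo_id=model_id, token=token)
```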

feat(download-model): add model download at runtime
fc30f26

yusufs committed
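
Commit fc30f26 adds a runtime download path as well. Because snapshot_download only fetches files missing from the local cache, a runtime guard can simply call it again before the server starts; this sketch uses the same assumed MODEL_ID and HF_TOKEN variable names as above:

```python
# Runtime guard sketch (illustrative). If the weights were baked in at build
# time they are already cached and this returns immediately; otherwise the
# model is fetched before serving begins.
import os

from huggingface_hub import snapshot_download

if __name__ == "__main__":
    model_id = os.environ.get("MODEL_ID", "facebook/opt-125m")  # assumed names
    path = snapshot_download(repo_id=model_id, token=os.environ.get("HF_TOKEN"))
    print(f"serving model from {path}")
```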