Out-of-memory when loading the model with vLLM
When I try to load the model using vLLM, it consumes all of my memory (128 GB) and throws an out-of-memory (OOM) error. The pipeline from the transformers library can be used instead, but the inference results are abnormal.
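Roughly, the load looks like the following; the checkpoint name and prompt are placeholders on my part, not the exact script:

```python
# Minimal sketch of the vLLM load that triggers the OOM.
# The checkpoint name and prompt are placeholders, not the exact command used.
from vllm import LLM, SamplingParams

llm = LLM(model="TIGER-Lab/MAmmoTH-7B-Mistral")  # assumed checkpoint id
params = SamplingParams(temperature=0.0, max_tokens=256)

outputs = llm.generate(["What is 17 * 24?"], params)
print(outputs[0].outputs[0].text)
```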
Is this OOM on the CPU or the GPU? It should work fine. I have the inference code at https://github.com/TIGER-AI-Lab/MAmmoTH/blob/main/requirements.txt.
It is CPU OOM. Memory consumption keeps growing and eventually exhausts all of my 128 GB of RAM, which shouldn't be the case. Other MAmmoTH models don't exhibit this issue.
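A sketch of the kind of check that shows the growth, printing the process RSS around generate calls (psutil and the prompt batch are just for illustration; `llm` and `params` are from the sketch above):

```python
# Print the process resident set size (RSS) after each batch of generations
# to watch CPU memory climb. psutil and the batch contents are assumptions.
import os
import psutil

proc = psutil.Process(os.getpid())

def rss_gb() -> float:
    return proc.memory_info().rss / 1024 ** 3

prompts = ["What is 17 * 24?"] * 32  # placeholder batch
for step in range(5):
    llm.generate(prompts, params)
    print(f"batch {step}: RSS = {rss_gb():.1f} GB")
```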
I see. Normally inference won't take that much memory. Can you confirm whether it comes from vLLM or from the Mistral model itself (Hugging Face Transformers)? I think these two would be the main sources.
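Something along these lines, with plain Transformers only, should isolate it; the checkpoint id and the dtype/loading flags here are assumptions, not the exact settings from your run:

```python
# Load the checkpoint with Hugging Face Transformers alone (no vLLM) and check
# whether CPU memory still balloons. Checkpoint id and dtype are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TIGER-Lab/MAmmoTH-7B-Mistral"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # avoid materializing fp32 weights on CPU
    low_cpu_mem_usage=True,      # stream weights instead of building an extra copy
    device_map="auto",           # place layers on GPU if one is available
)

inputs = tokenizer("What is 17 * 24?", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```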
I tried using Hugging Face Transformers instead of vLLM, and I encountered the same out-of-memory issue.