ImportError: cannot import name 'cuda_utils' from partially initialized module 'vllm' (most likely due to a circular import)
I followed the vLLM documentation and ran the code below exactly as written to start AquilaChat2-34B-16K-AWQ, but it fails with an error.
from vllm import LLM, SamplingParams
prompts = [
    "Tell me about AI",
    "Write a story about llamas",
    "What is 291 - 150?",
    "How much wood would a woodchuck chuck if a woodchuck could chuck wood?",
]
prompt_template = '''System: A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.
Human: {prompt}
Assistant:
'''
prompts = [prompt_template.format(prompt=prompt) for prompt in prompts]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
llm = LLM(model="TheBloke/AquilaChat2-34B-16K-AWQ", quantization="awq", dtype="auto")
outputs = llm.generate(prompts, sampling_params)
# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
How can I fix this? The full error is:
ImportError: cannot import name 'cuda_utils' from partially initialized module 'vllm' (most likely due to a circular import) (/home/ps/app/edison/vllm/vllm/__init__.py)
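
A possible diagnostic, in case it helps narrow this down. The path in the error points into a source checkout (/home/ps/app/edison/vllm), so one guess is that Python is picking up the unbuilt source tree, where a compiled `cuda_utils` extension would be missing. That is only an assumption on my part, and so is the idea that `cuda_utils` ships as a compiled `.so`/`.pyd` file next to `__init__.py`. The helper names `locate_package` and `find_compiled_submodule` are made up for this sketch. It locates the package without importing it and lists any matching extension files:

```python
import importlib.util
from pathlib import Path


def locate_package(name):
    """Return the directory a top-level package would be imported from,
    without actually importing it. Returns None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def find_compiled_submodule(package_dir, submodule):
    """List files in package_dir that look like a compiled extension
    named `submodule` (assumed suffixes: .so on Linux, .pyd on Windows)."""
    matches = []
    for pattern in (f"{submodule}*.so", f"{submodule}*.pyd"):
        matches.extend(sorted(Path(package_dir).glob(pattern)))
    return matches


if __name__ == "__main__":
    pkg_dir = locate_package("vllm")
    if pkg_dir is None:
        print("vllm is not importable from this environment")
    else:
        print(f"vllm would be imported from: {pkg_dir}")
        found = find_compiled_submodule(pkg_dir, "cuda_utils")
        if found:
            print(f"compiled cuda_utils found: {found}")
        else:
            print("no compiled cuda_utils next to __init__.py")
```

If this prints the source checkout directory and finds no compiled file, running the script from a different working directory or reinstalling vLLM so the extensions get built may be worth trying.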