Qwen/Qwen1.5-14B-Chat-GPTQ-Int4

Text Generation

text-generation-inference

Inference Endpoints

4-bit precision

[AUTOMATED] Model Memory Requirements

#4 opened 8 months ago by model-sizer-bot

Is fast attention supported?

#2 opened 9 months ago by

Can't run with FastChat on CUDA 12.1

#1 opened 9 months ago by