ValueError: Block pattern could not be match. Pass `block_name_to_quantize` argument in `quantize_model`

#36 opened 9 months ago by

Gonzalomoreno01

Some weights of the model checkpoint at Llama-2-7B-Chat-GPTQ were not used when initializing LlamaForCausalLM

#35 opened 10 months ago by

thlw

[AUTOMATED] Model Memory Requirements

#34 opened 12 months ago by

model-sizer-bot

[AUTOMATED] Model Memory Requirements

#33 opened 12 months ago by

model-sizer-bot

[AUTOMATED] Model Memory Requirements

#32 opened 12 months ago by

model-sizer-bot

Fails with transformers==4.38.1

#30 opened about 1 year ago by

rohithkrn

Index out of range error: QAchain for pdf chatbot

#29 opened about 1 year ago by

Artemis3196

How to overcome bad output for better results?

#28 opened over 1 year ago by

notmax123

Functional example of finetuning of Llama-2-7b-Chat-GPTQ

#26 opened over 1 year ago by

echogit

AUTOGPTQ Error in Google Colab

#25 opened over 1 year ago by

echogit

Does the model respond correctly?

#24 opened over 1 year ago by

mnwato

Can't Load Model in Kubernetes but can in Docker

#23 opened over 1 year ago by

jrsperry

TheBloke/Llama-2-7b-(Chat-)GPTQ repeats request

#22 opened over 1 year ago by

hyzhak

Cannot run batch on transformer

#20 opened over 1 year ago by

DatenlaborBerlin

The response is not formatted

#18 opened over 1 year ago by

Octavian81

How to load the GPTQ model using any pipeline method

#17 opened over 1 year ago by

harithushan

Error trying to run on a revision, tensors not conforming?

#16 opened over 1 year ago by

JohnSnyderTC

for faster GPU inference

#15 opened over 1 year ago by

harithushan

How to set it up in a way that it just returns output without the system message or query, basically the information after [/INST].

#14 opened over 1 year ago by

Pavan-growexx

Update for Transformers GPTQ support

#13 opened over 1 year ago by

TheBloke

LORA fine tuning error

#12 opened over 1 year ago by

tongwuhugging

GPTQ bugging: Wondering if I'm loading the model correctly

#9 opened over 1 year ago by

quantuan125

Please make this model quantised GPTQ

#7 opened over 1 year ago by

AiModelsMarket

TGI error

#5 opened over 1 year ago by

aiamateur101

Cannot use anything but what's in the main branch

#3 opened over 1 year ago by

HAvietisov

How to use this GPTQ model from Python code to continue a conversation?

#2 opened over 1 year ago by

shifa

"max_length": 4096, "max_position_embeddings": 4096,

#1 opened over 1 year ago by

pseudotensor