Compatible small models for speculative decoding?
#9 opened 2 days ago by treehugg3
How much GPU RAM is needed?
1
#8 opened about 1 month ago by RaidXD
Q8 with 8 parts
#7 opened about 2 months ago by sdyy
Q6_K vs. Q5_K_L
3
#6 opened 2 months ago by AIGUYCONTENT
Unable to pull in from Ollama
5
#3 opened 2 months ago by AIGUYCONTENT
Observation: 4-bit quantization can't answer the Strawberry prompt
12
#2 opened 2 months ago by ThePabli
Nemotron 51B too please
4
#1 opened 3 months ago by nacs