Compatible small models for speculative decoding?
#9 opened about 2 months ago by treehugg3
How much GPU RAM is needed?
1
#8 opened 3 months ago by RaidXD

q8 with 8 part
#7 opened 4 months ago by sdyy
Q6_K vs. Q5_K_L
3
#6 opened 4 months ago by AIGUYCONTENT

Unable to pull in from Ollama
5
#3 opened 4 months ago by AIGUYCONTENT

Observation: 4-bit quantization can't answer the Strawberry prompt
12
#2 opened 4 months ago by ThePabli

Nemotron 51B too please
4
#1 opened 4 months ago by nacs