ARM quants
#1
by
EloyOn
- opened
Will you consider adding i8mm (q4_0_4_8) quants, as Bartowski is doing for all the new models he quants?
With those, you can run a 12B model on a 16GB RAM smartphone at around 5 t/s.
Thank you for your quants.
I will try to upload these as well if I can. Requests like this generally give me a better sense of what will be most useful for people.
Update:
Uploading the ARM friendly quants!
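For anyone who wants to make these locally in the meantime, they can be produced with llama.cpp's `llama-quantize` tool, provided your build is recent enough to include the `Q4_0_4_8` type. A sketch (file names are placeholders):

```shell
# Requantize an existing f16 GGUF into the ARM i8mm-friendly layout.
# Paths below are placeholders; point them at your own model files.
./llama-quantize model-f16.gguf model-Q4_0_4_8.gguf Q4_0_4_8
```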
Thank you, you are a hero. I'm sure that users who run AIs on their smartphones will appreciate it.