@singhsidhukuldeep on Hugging Face: "Good folks from @Microsoft Research have just released bitnet.cpp, a…"


singhsidhukuldeep

posted an update 24 days ago


Good folks from @Microsoft Research have just released bitnet.cpp, an inference framework for 1-bit LLMs (e.g., BitNet b1.58) that reports large speed and energy gains on CPUs.

Key Technical Highlights:
- Achieves speedups of up to 6.17x on x86 CPUs and 5.07x on ARM CPUs
- Reduces energy consumption by 55.4–82.2%
- Enables running 100B parameter models at human reading speed (5–7 tokens/second) on a single CPU
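To see why a single CPU machine becomes plausible at this scale, here is a rough back-of-the-envelope sketch of weight storage alone. It assumes 2 bits per weight (as in the I2_S kernel mentioned in this post) versus 16 bits for fp16, and ignores activations, the KV cache, and any packing overhead. The function name is hypothetical and not part of bitnet.cpp.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Counts weights only; activations, KV cache, and metadata are ignored.
    """
    return n_params * bits_per_weight / 8 / 1e9


n_params = 100e9  # 100B parameters, the largest size quoted in the post

fp16_gb = weight_memory_gb(n_params, 16)  # 16-bit baseline
i2_gb = weight_memory_gb(n_params, 2)     # assumed 2-bit ternary storage

print(f"fp16 weights:  {fp16_gb:.0f} GB")
print(f"2-bit weights: {i2_gb:.0f} GB")
print(f"reduction:     {fp16_gb / i2_gb:.0f}x")
```

Under these assumptions the weights shrink from about 200 GB to about 25 GB, which is the difference between needing a multi-GPU server and fitting in the RAM of a high-end workstation.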

Features Three Optimized Kernels:
1. I2_S: Uses 2-bit weight representation
2. TL1: Implements 4-bit index lookup tables for every two weights
3. TL2: Employs 5-bit compression for every three weights
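The bit widths above follow from counting: each weight is ternary (-1, 0, or 1), so two weights have 3^2 = 9 combinations (fits in 4 bits) and three weights have 3^3 = 27 (fits in 5 bits). The following is a minimal pure-Python sketch of those ideas, not the actual bitnet.cpp kernels or memory layout. All function names are hypothetical, and the base-3 group index is an illustrative encoding chosen for simplicity.

```python
from itertools import product

TERNARY = (-1, 0, 1)


def pack_i2(weights):
    """Pack ternary weights at 2 bits each, four per byte (I2_S-style idea)."""
    out = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            byte |= (w + 1) << (2 * j)  # map -1, 0, 1 to 0, 1, 2
        out.append(byte)
    return bytes(out)


def unpack_i2(data, n):
    """Recover the first n ternary weights from 2-bit packed bytes."""
    return [((data[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(n)]


def group_index(group):
    """Base-3 index of a group of ternary weights (first weight most significant)."""
    idx = 0
    for w in group:
        idx = idx * 3 + (w + 1)
    return idx


def lut_dot(weights, activations, g):
    """Dot product via lookup tables over groups of g weights.

    g=2 mirrors the TL1 idea (9 entries, 4-bit index);
    g=3 mirrors the TL2 idea (27 entries, 5-bit index).
    A real kernel would build each table once per activation group and
    reuse it across many weight rows; here it is rebuilt for clarity.
    """
    assert len(weights) == len(activations) and len(weights) % g == 0
    total = 0
    for start in range(0, len(weights), g):
        acts = activations[start:start + g]
        # All 3**g possible partial sums, in the same order as group_index.
        table = [sum(w * a for w, a in zip(combo, acts))
                 for combo in product(TERNARY, repeat=g)]
        total += table[group_index(weights[start:start + g])]
    return total


weights = [1, -1, 0, 1, 0, -1]
activations = [3, 5, -2, 7, 1, 4]
reference = sum(w * a for w, a in zip(weights, activations))

print(unpack_i2(pack_i2(weights), len(weights)) == weights)
print(lut_dot(weights, activations, 2) == reference)
print(lut_dot(weights, activations, 3) == reference)
```

In this sketch the lookup-table variants replace multiplications with table reads. Grouping three weights into 5 bits costs about 1.67 bits per weight, compared with 2 bits per weight for the other two schemes.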

Performance Metrics:
- Reported as lossless inference, with 100% accuracy compared to full-precision models
- Tested across model sizes from 125M to 100B parameters
- Evaluated on both Apple M2 Ultra and Intel i7-13700H processors

This breakthrough makes running large language models locally more accessible than ever, opening new possibilities for edge computing and resource-constrained environments.


m-conrad-202

23 days ago

A proper link to the Microsoft page would be appreciated.

m-conrad-202

23 days ago

https://github.com/microsoft/BitNet

SerialKicked

22 days ago

edited 22 days ago

It's kinda misleading to say they have the same accuracy as full precision. It was only tested on one very specific 0.7B parameter model over 1000 undisclosed prompts. That's kinda weak sauce for a testing environment, and wholly insufficient to make such a statement. I doubt those results will scale up this flawlessly for models on which this feature would actually be useful.
