Open-source AI creates healthy competition in a field whose natural tendencies lead to extreme concentration of power. Imagine a world where only one or two companies could build software. That concentration is the biggest risk and ethical challenge of them all, IMO. Let's fight it!
Remember when @Google launched MediaPipe in an effort to create efficient on-device pipelines?
They've just unlocked the ability to run 7B+ parameter language models directly in your browser. This is a game-changer for on-device AI!
Yes, they are streaming 8.6 GB model files!
Currently, they have Gemma 2B/7B running, but imagine dynamic LoRA, multimodal support, quantization, and never leaving Chrome!
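To make this concrete, here's roughly what running Gemma in the browser looks like with MediaPipe's LLM Inference web API. A minimal sketch: the CDN URL and model path are placeholders you'd swap for your own hosted Gemma weights.

```typescript
import {FilesetResolver, LlmInference} from '@mediapipe/tasks-genai';

// Load the MediaPipe GenAI WASM runtime (CDN path is a placeholder).
const genai = await FilesetResolver.forGenAiTasks(
  'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
);

// Point at your hosted Gemma model file; the browser streams it in.
const llm = await LlmInference.createFromOptions(genai, {
  baseOptions: {modelAssetPath: '/models/gemma-2b-it-gpu-int4.bin'},
  maxTokens: 512,
  topK: 40,
  temperature: 0.8,
});

// Everything below runs on-device: no prompt ever leaves the browser.
const answer = await llm.generateResponse('Explain WebGPU in one sentence.');
console.log(answer);
```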
This is a significant technical advancement, especially in Memory Optimization:
- Redesigned the model-loading code to work around WebAssembly's 4 GB memory limit.
- Implemented asynchronous loading of transformer stack layers (28 for Gemma 1.1 7B), sketched below.
- Reduced peak WebAssembly memory usage to less than 1% of previous requirements.
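That async, layer-by-layer loading is the key trick: instead of materializing all 8.6 GB in WASM memory, weights are streamed and handed to the GPU one chunk at a time. Here's a conceptual TypeScript sketch of the idea, not MediaPipe's actual internals; `uploadLayerToGpu` is a hypothetical stand-in for the WebGPU buffer upload.

```typescript
// Conceptual sketch: stream a large model file and hand each layer's
// weights to the GPU without ever holding the whole file in memory.
async function streamModel(
  url: string,
  layerSizes: number[],                           // bytes per transformer layer
  uploadLayerToGpu: (layer: Uint8Array) => void,  // hypothetical GPU upload hook
): Promise<void> {
  const response = await fetch(url);
  const reader = response.body!.getReader();

  let pending = new Uint8Array(0);
  let layerIndex = 0;

  while (layerIndex < layerSizes.length) {
    const {done, value} = await reader.read();
    if (done) break;

    // Append the new chunk to whatever is left over from the last read.
    const merged = new Uint8Array(pending.length + value.length);
    merged.set(pending);
    merged.set(value, pending.length);
    pending = merged;

    // Peel off complete layers as soon as their bytes have arrived;
    // once uploaded, the CPU-side copy can be garbage-collected.
    while (layerIndex < layerSizes.length && pending.length >= layerSizes[layerIndex]) {
      const size = layerSizes[layerIndex];
      uploadLayerToGpu(pending.slice(0, size));
      pending = pending.slice(size);
      layerIndex++;
    }
  }
}
```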
Cross-Platform Compatibility:
- Compiled the C++ codebase to WebAssembly for broad browser support.
- Utilized the WebGPU API for native GPU acceleration in browsers.
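WebGPU isn't everywhere yet, so in practice you'd feature-detect it and fall back to a CPU/WASM path. A minimal check (assumes the @webgpu/types package so TypeScript knows about navigator.gpu):

```typescript
// Minimal WebGPU feature detection with a WASM/CPU fallback.
async function pickBackend(): Promise<'webgpu' | 'wasm'> {
  // navigator.gpu only exists in WebGPU-capable browsers (e.g. recent Chrome).
  const adapter = await navigator.gpu?.requestAdapter();
  return adapter ? 'webgpu' : 'wasm';
}

console.log(`Running inference on: ${await pickBackend()}`);
```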
Here's why this matters:
1. Privacy: No need to send data to remote servers.
2. Cost-Efficiency: Eliminates server expenses.
3. Offline Capabilities: Use powerful AI without an internet connection.
🎯 Targeted training with Spectrum

I used Spectrum, a relatively new technique for parameter-efficient learning. The idea is to train only the layers of the model with a high signal-to-noise ratio (SNR) and ❄️ freeze the rest. I trained the top 30% of model layers.
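For intuition, here's a sketch of just the selection step. Spectrum itself is typically run from Python/PyTorch and computes the SNR scores from the weight matrices; the layer names and SNR values below are hypothetical, and the 30% threshold matches the setup above.

```typescript
// Conceptual sketch of Spectrum-style layer selection: given per-layer
// SNR scores, keep the top 30% trainable and freeze the rest.
interface LayerScore {
  name: string;  // e.g. "model.layers.12.mlp.down_proj" (hypothetical)
  snr: number;   // signal-to-noise ratio from Spectrum's analysis
}

function selectTrainableLayers(scores: LayerScore[], topFraction = 0.3): Set<string> {
  const ranked = [...scores].sort((a, b) => b.snr - a.snr);
  const keep = Math.ceil(ranked.length * topFraction);
  return new Set(ranked.slice(0, keep).map((s) => s.name));
}

// Usage: freeze every layer whose name is not in the returned set.
const trainable = selectTrainableLayers([
  {name: 'layers.0.self_attn.q_proj', snr: 1.8},
  {name: 'layers.0.mlp.down_proj', snr: 0.4},
  {name: 'layers.1.self_attn.q_proj', snr: 2.3},
]);
console.log(trainable); // top 30% by SNR -> 1 of these 3 layers
```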