Training a model to reason in a continuous latent space, based on Meta's Coconut. If it all works, I'll apply it to the MiniCPM-o SVD-LR. The endgame is a multimodal, adaptive, and efficient foundational on-device AI model.
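For anyone new to the idea, here is a toy sketch of Coconut-style latent reasoning: instead of decoding each intermediate step into a token, the last hidden state is fed straight back in as the next input embedding. Everything in it (`make_toy_model`, `forward`, `latent_reasoning`, the random weights, the single tanh layer standing in for a transformer) is a hypothetical illustration, not my training code and not Meta's implementation.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_toy_model(vocab_size=5, dim=4, seed=0):
    """Hypothetical stand-in for an LLM: random weights, no training."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    return {
        "embed": rand(vocab_size, dim),    # token id -> input embedding
        "W": rand(dim, dim),               # stand-in for the transformer stack
        "unembed": rand(vocab_size, dim),  # hidden state -> token logits
    }


def forward(model, x):
    """One forward pass: input embedding -> last hidden state."""
    return [math.tanh(v) for v in matvec(model["W"], x)]


def latent_reasoning(model, token_id, n_thoughts):
    """Run n_thoughts continuous thoughts, then decode a single token."""
    x = model["embed"][token_id]
    for _ in range(n_thoughts):
        # Continuous thought: the hidden state becomes the next input
        # embedding directly, with no decoding to a token in between.
        x = forward(model, x)
    hidden = forward(model, x)
    logits = matvec(model["unembed"], hidden)
    token = max(range(len(logits)), key=logits.__getitem__)
    return token, hidden


model = make_toy_model()
token, hidden = latent_reasoning(model, token_id=1, n_thoughts=3)
```

With `n_thoughts=0` this reduces to ordinary one-step decoding, which makes it easy to compare the two modes on the same weights.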
'Can it run DeepSeek V3 671B' is the new 'can it run Doom'.
How minimalistic can I go with on-device AI and behemoth models? Here I'm running DeepSeek V3 MoE on a single A6000 GPU.
Not great, not terrible, for this minimalistic setup. I love Mixture of Experts architectures. Typically I run my core LLM distributed across 4 GPUs.
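Why MoE suits small setups, as a rough sketch: a router picks only the top-k experts per token, so most of the weights sit idle on any given step. The code below is generic softmax top-k routing with made-up toy experts, not DeepSeek's actual router. The 671B total and 37B activated-per-token figures are the published DeepSeek V3 numbers; all function and variable names are hypothetical.

```python
import math


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalise their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, experts, router_logits, k):
    """Weighted sum over the selected experts only; the rest are never run."""
    out = 0.0
    for idx, weight in top_k_route(router_logits, k):
        out += weight * experts[idx](x)
    return out


# Toy experts: each one just scales its input (illustrative only).
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]

routes = top_k_route(logits, k=2)
y = moe_layer(10.0, experts, logits, k=2)

# Share of DeepSeek V3 parameters touched per token (billions of parameters).
TOTAL_PARAMS_B = 671
ACTIVE_PARAMS_B = 37
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
```

Only about one parameter in eighteen is active per token, which is what makes offloading or sharding the idle experts tolerable.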
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.