Mex Ivanov
MexIvanov
2 followers · 8 following
AI & ML interests
NLP, Coding, Quantum Computing and more.
Recent Activity
Reacted to singhsidhukuldeep's post · 12 days ago
Exciting News in AI: JinaAI Releases JINA-CLIP-v2!

The team at Jina AI has just released a groundbreaking multilingual multimodal embedding model that's pushing the boundaries of text-image understanding. Here's why this is a big deal:

Technical Highlights:
- Dual encoder architecture combining a 561M-parameter Jina XLM-RoBERTa text encoder and a 304M-parameter EVA02-L14 vision encoder
- Supports 89 languages with an 8,192-token context length
- Processes images up to 512×512 pixels with a 14×14 patch size
- Implements FlashAttention2 for text and xFormers for vision processing
- Uses Matryoshka Representation Learning for efficient vector storage

Under the Hood:
- Multi-stage training process with progressive resolution scaling (224→384→512)
- Contrastive learning using InfoNCE loss in both directions
- Trained on a massive multilingual dataset including 400M English and 400M multilingual image-caption pairs
- Incorporates specialized datasets for document understanding, scientific graphs, and infographics
- Uses hard negative mining with 7 negatives per positive sample

Performance:
- Outperforms previous models on visual document retrieval (52.65% nDCG@5)
- Achieves 89.73% image-to-text and 79.09% text-to-image retrieval on the CLIP benchmark
- Strong multilingual performance across 30 languages
- Maintains performance even with 75% dimension reduction (256D vs. 1024D)

Key Innovation:
The model solves the long-standing challenge of unifying text-only and multimodal retrieval systems while adding robust multilingual support. Perfect for building cross-lingual visual search systems!

Kudos to the research team at Jina AI for this impressive advancement in multimodal AI!
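To connect the training details above to code, here is a minimal, illustrative sketch of what "contrastive learning using InfoNCE loss in both directions" typically looks like for a dual text/image encoder. This is not Jina's actual training code; the temperature value and the use of in-batch negatives are assumptions.

```python
import torch
import torch.nn.functional as F

def symmetric_info_nce(text_emb: torch.Tensor,
                       image_emb: torch.Tensor,
                       temperature: float = 0.07) -> torch.Tensor:
    """Bidirectional (CLIP-style) InfoNCE over a batch of paired embeddings.

    text_emb, image_emb: (batch, dim) tensors where row i of each is a
    matching text-image pair; every other row serves as an in-batch negative.
    """
    # Cosine similarities via L2-normalised dot products.
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.T / temperature   # (batch, batch)

    # The positive pair for row i sits on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # "In both directions": text-to-image and image-to-text, averaged.
    loss_t2i = F.cross_entropy(logits, targets)
    loss_i2t = F.cross_entropy(logits.T, targets)
    return (loss_t2i + loss_i2t) / 2
```

The Matryoshka point above roughly amounts to keeping only the leading dimensions of the final embedding (for example the first 256 of 1024) and re-normalising before computing similarities, which is where the 75% storage reduction comes from.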
Reacted to singhsidhukuldeep's post · 13 days ago
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches the performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
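To make the dynamic-patching idea concrete, below is a toy sketch of entropy-based patch boundary selection. The `next_byte_probs` callback is a hypothetical stand-in for BLT's small byte-level entropy model, and the threshold and patch-length cap are made-up values; the paper's actual boundary rules are more involved.

```python
import math
from typing import Callable, List

def entropy_patches(data: bytes,
                    next_byte_probs: Callable[[bytes, int], List[float]],
                    threshold: float = 2.0,
                    max_patch_len: int = 16) -> List[bytes]:
    """Group raw bytes into variable-sized patches using an entropy signal.

    next_byte_probs(data, i) is assumed to return a 256-way probability
    distribution over the byte at position i; a new patch is started whenever
    the Shannon entropy of that distribution exceeds `threshold`, so
    hard-to-predict regions end up with more, smaller patches (and hence
    more compute in the global transformer).
    """
    if not data:
        return []
    patches, start = [], 0
    for i in range(1, len(data)):
        probs = next_byte_probs(data, i)
        entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
        if entropy > threshold or (i - start) >= max_patch_len:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches
```

Predictable runs (whitespace, repeated characters) collapse into long patches, while unpredictable spans such as rare character sequences are split finely, which is how the compute savings described above are targeted.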
Liked a model · 18 days ago
CohereForAI/c4ai-command-r7b-12-2024
Organizations
None yet
MexIvanov's activity
Liked a model · 18 days ago
CohereForAI/c4ai-command-r7b-12-2024
Text Generation • Updated 16 days ago • 28k • 326
Liked a model · about 1 month ago
jinaai/jina-embeddings-v3
Feature Extraction • Updated Dec 3, 2024 • 841k • 635
Liked a dataset · about 1 month ago
wikimedia/wikipedia
Viewer • Updated Jan 9, 2024 • 61.6M • 50.6k • 671
Liked a model · about 2 months ago
NexaAIDev/OmniVLM-968M
Updated 19 days ago • 2.5k • 490
Liked a dataset · 6 months ago
HuggingFaceTB/smollm-corpus
Viewer • Updated Sep 6, 2024 • 237M • 39.5k • 271
Liked a model · 7 months ago
sentence-transformers/LaBSE
Sentence Similarity • Updated Mar 27, 2024 • 767k • 237
Liked a dataset · 7 months ago
sentence-transformers/trivia-qa-triplet
Viewer • Updated Jun 21, 2024 • 52.9M • 185 • 5
Liked 2 models · 7 months ago
mistralai/Mistral-7B-v0.3
Text Generation • Updated Jul 24, 2024 • 3.63M • 412
openbmb/MiniCPM-Llama3-V-2_5
Image-Text-to-Text • Updated Sep 25, 2024 • 28.9k • 1.39k
Liked 2 models · 9 months ago
urchade/gliner_large_bio-v0.1
Token Classification • Updated Apr 9, 2024 • 114 • 9
urchade/gliner_medium-v2.1
Token Classification • Updated Aug 21, 2024 • 17.5k • 28
Liked 9 models · 10 months ago
urchade/gliner_large-v1
Updated Apr 10, 2024 • 1.23k • 4
urchade/gliner_medium-v2
Updated Apr 10, 2024 • 65 • 5
urchade/gliner_large-v2
Token Classification • Updated Jul 12, 2024 • 5.74k • 44
urchade/gliner_small-v1
Token Classification • Updated Apr 10, 2024 • 656 • 9
urchade/gliner_small-v2
Updated Apr 10, 2024 • 170 • 6
urchade/gliner_medium-v1
Updated May 7, 2024 • 105 • 5
urchade/gliner_multi
Token Classification • Updated Apr 10, 2024 • 27.1k • 124
urchade/gliner_base
Token Classification • Updated Apr 10, 2024 • 2.83k • 71
sambanovasystems/SambaLingo-Russian-Chat
Text Generation • Updated Apr 16, 2024 • 214 • 52