Mex Ivanov
MexIvanov
2 followers · 8 following
AI & ML interests
NLP, Coding, Quantum Computing and more.
Recent Activity
Reacted to singhsidhukuldeep's post · 12 days ago
Exciting News in AI: JinaAI Releases JINA-CLIP-v2!

The team at Jina AI has just released a groundbreaking multilingual multimodal embedding model that's pushing the boundaries of text-image understanding. Here's why this is a big deal:

Technical Highlights:
- Dual encoder architecture combining a 561M-parameter Jina XLM-RoBERTa text encoder and a 304M-parameter EVA02-L14 vision encoder
- Supports 89 languages with an 8,192-token context length
- Processes images up to 512×512 pixels with a 14×14 patch size
- Implements FlashAttention2 for text and xFormers for vision processing
- Uses Matryoshka Representation Learning for efficient vector storage

Under the Hood:
- Multi-stage training process with progressive resolution scaling (224→384→512)
- Contrastive learning using InfoNCE loss in both directions
- Trained on a massive multilingual dataset including 400M English and 400M multilingual image-caption pairs
- Incorporates specialized datasets for document understanding, scientific graphs, and infographics
- Uses hard negative mining with 7 negatives per positive sample

Performance:
- Outperforms previous models on visual document retrieval (52.65% nDCG@5)
- Achieves 89.73% image-to-text and 79.09% text-to-image retrieval on the CLIP benchmark
- Strong multilingual performance across 30 languages
- Maintains performance even with 75% dimension reduction (256D vs. 1024D)

Key Innovation:
The model solves the long-standing challenge of unifying text-only and multimodal retrieval systems while adding robust multilingual support. Perfect for building cross-lingual visual search systems!

Kudos to the research team at Jina AI for this impressive advancement in multimodal AI!
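To connect the training details above to code, here is a minimal, illustrative sketch of what "contrastive learning using InfoNCE loss in both directions" typically looks like for a dual text/image encoder. This is not Jina's actual training code; the temperature value and the use of in-batch negatives are assumptions.

```python
import torch
import torch.nn.functional as F

def symmetric_info_nce(text_emb: torch.Tensor,
                       image_emb: torch.Tensor,
                       temperature: float = 0.07) -> torch.Tensor:
    """Bidirectional (CLIP-style) InfoNCE over a batch of paired embeddings.

    text_emb, image_emb: (batch, dim) tensors where row i of each is a
    matching text-image pair; every other row serves as an in-batch negative.
    """
    # Cosine similarities via L2-normalised dot products.
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.T / temperature   # (batch, batch)

    # The positive pair for row i sits on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # "In both directions": text-to-image and image-to-text, averaged.
    loss_t2i = F.cross_entropy(logits, targets)
    loss_i2t = F.cross_entropy(logits.T, targets)
    return (loss_t2i + loss_i2t) / 2
```

The Matryoshka point above roughly amounts to keeping only the leading dimensions of the final embedding (for example the first 256 of 1024) and re-normalising before computing similarities, which is where the 75% storage reduction comes from.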
Reacted to singhsidhukuldeep's post · 13 days ago
Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches the performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
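To make the dynamic-patching idea concrete, below is a toy sketch of entropy-based patch boundary selection. The `next_byte_probs` callback is a hypothetical stand-in for BLT's small byte-level entropy model, and the threshold and patch-length cap are made-up values; the paper's actual boundary rules are more involved.

```python
import math
from typing import Callable, List

def entropy_patches(data: bytes,
                    next_byte_probs: Callable[[bytes, int], List[float]],
                    threshold: float = 2.0,
                    max_patch_len: int = 16) -> List[bytes]:
    """Group raw bytes into variable-sized patches using an entropy signal.

    next_byte_probs(data, i) is assumed to return a 256-way probability
    distribution over the byte at position i; a new patch is started whenever
    the Shannon entropy of that distribution exceeds `threshold`, so
    hard-to-predict regions end up with more, smaller patches (and hence
    more compute in the global transformer).
    """
    if not data:
        return []
    patches, start = [], 0
    for i in range(1, len(data)):
        probs = next_byte_probs(data, i)
        entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
        if entropy > threshold or (i - start) >= max_patch_len:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches
```

Predictable runs (whitespace, repeated characters) collapse into long patches, while unpredictable spans such as rare character sequences are split finely, which is how the compute savings described above are targeted.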
Liked a model · 18 days ago
CohereForAI/c4ai-command-r7b-12-2024
Organizations
None yet
MexIvanov's activity
Liked a model · 18 days ago
CohereForAI/c4ai-command-r7b-12-2024
Text Generation • Updated 16 days ago • 28k • 326
Liked a model · about 1 month ago
jinaai/jina-embeddings-v3
Feature Extraction • Updated Dec 3, 2024 • 841k • 635
Liked a dataset · about 1 month ago
wikimedia/wikipedia
Viewer • Updated Jan 9, 2024 • 61.6M • 50.6k • 671
Liked a model · about 2 months ago
NexaAIDev/OmniVLM-968M
Updated 19 days ago • 2.5k • 490
Liked a dataset · 6 months ago
HuggingFaceTB/smollm-corpus
Viewer • Updated Sep 6, 2024 • 237M • 39.5k • 271
Liked a model · 7 months ago
sentence-transformers/LaBSE
Sentence Similarity • Updated Mar 27, 2024 • 767k • 237
Liked a dataset · 7 months ago
sentence-transformers/trivia-qa-triplet
Viewer • Updated Jun 21, 2024 • 52.9M • 185 • 5
Liked 2 models · 7 months ago
mistralai/Mistral-7B-v0.3
Text Generation • Updated Jul 24, 2024 • 3.63M • 412
openbmb/MiniCPM-Llama3-V-2_5
Image-Text-to-Text • Updated Sep 25, 2024 • 28.9k • 1.39k
Liked 2 models · 9 months ago
urchade/gliner_large_bio-v0.1
Token Classification • Updated Apr 9, 2024 • 114 • 9
urchade/gliner_medium-v2.1
Token Classification • Updated Aug 21, 2024 • 17.5k • 28
Liked 9 models · 10 months ago
urchade/gliner_large-v1
Updated Apr 10, 2024 • 1.23k • 4
urchade/gliner_medium-v2
Updated Apr 10, 2024 • 65 • 5
urchade/gliner_large-v2
Token Classification • Updated Jul 12, 2024 • 5.74k • 44
urchade/gliner_small-v1
Token Classification • Updated Apr 10, 2024 • 656 • 9
urchade/gliner_small-v2
Updated Apr 10, 2024 • 170 • 6
urchade/gliner_medium-v1
Updated May 7, 2024 • 105 • 5
urchade/gliner_multi
Token Classification • Updated Apr 10, 2024 • 27.1k • 124
urchade/gliner_base
Token Classification • Updated Apr 10, 2024 • 2.83k • 71
sambanovasystems/SambaLingo-Russian-Chat
Text Generation • Updated Apr 16, 2024 • 214 • 52