Pablo Montalvo PRO

Molbap

AI & ML interests

None yet

Recent Activity

updated a model 20 days ago

Molbap/molmo-hf-7B-D

updated a model about 1 month ago

Molbap/molmo-hf-72B

liked a model about 1 month ago

yonigozlan/GOT-OCR-2.0-hf

Articles

Introducing TextImage Augmentation for Document Images

Aug 6

• 32

Organizations

Molbap's activity

upvoted 4 articles 5 months ago

Article

Introducing TextImage Augmentation for Document Images

Aug 6

• 32

Article

MobileNet Baselines

Jul 26

• 23

Article

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?

Jul 25

• 18

Article

Mixture of Experts Explained

Dec 11, 2023

• 230

upvoted a paper 6 months ago

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10 • 68

upvoted 2 collections 6 months ago

Searching for Better ViT Baselines

Collection

Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 25 items • Updated Aug 21 • 13

MobileNetV4 pretrained weights

Collection

Weights for MobileNet-V4 pretrained in timm • 17 items • Updated Sep 22 • 18

upvoted a paper 6 months ago

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Paper • 2406.11271 • Published Jun 17 • 20

upvoted 3 papers 7 months ago

upvoted 2 articles 7 months ago

Article

AI has a problem with objectifying women

May 24

• 55

Article

MobileNet-V4 (now in timm)

Jun 17

• 39

upvoted 2 articles 8 months ago

Article

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

May 16

• 17

Article

License to Call: Introducing Transformers Agents 2.0

May 13

• 119

upvoted a collection 8 months ago

PaliGemma Release

Collection

Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 16 days ago • 142

upvoted 2 articles 8 months ago

Article

2024-04-22 - Hub Incident Post Mortem

May 17

• 17

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

May 14

• 227

upvoted a paper 9 months ago

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Paper • 2404.06512 • Published Apr 9 • 29

upvoted a paper 10 months ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 604