Juan Balarini PRO

jpbalarini

AI & ML interests

None yet

Recent Activity

liked a model about 12 hours ago

cognitivecomputations/Dolphin3.0-Llama3.1-8B

upvoted a collection 13 days ago

QVQ-72B-Preview

liked a Space 24 days ago

franciszzj/Leffa

Organizations

None yet

jpbalarini's activity

upvoted a collection 13 days ago

QVQ-72B-Preview

5 items • Updated 13 days ago • 6

upvoted an article 2 months ago

Article

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Jul 5, 2024 • 184

upvoted 2 collections 2 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 15 days ago • 197

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101

upvoted a collection 3 months ago

Molmo

Artifacts for open multimodal language models. • 5 items • Updated Nov 27, 2024 • 291

upvoted 2 collections 4 months ago

Qwen2-VL

Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 186

Sapiens

Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens • 72 items • Updated Sep 18, 2024 • 50

upvoted a collection 5 months ago

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated Nov 14, 2024 • 542

upvoted a collection 6 months ago

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 15 days ago • 208

upvoted an article 6 months ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24, 2024 • 181

upvoted 4 collections 7 months ago

Florence

9 items • Updated Jul 11, 2024 • 162

DeepSeekCoder-V2

6 items • Updated Sep 5, 2024 • 82

4M Models

Multimodal models from https://4m.epfl.ch/ • 14 items • Updated Jun 14, 2024 • 31

Nemotron 4 340B

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated about 20 hours ago • 161

upvoted 2 papers 10 months ago

SaulLM-7B: A pioneering Large Language Model for Law

Paper • 2403.03883 • Published Mar 6, 2024 • 77

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 605

upvoted a collection 11 months ago

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Nov 28, 2024 • 205