Vision - a diwank Collection



Vision

updated 7 days ago

apple/DepthPro

Depth Estimation • Updated Oct 9, 2024 • 2.21k • 375
rhymes-ai/Aria

Image-Text-to-Text • Updated 18 days ago • 19.8k • 602
mit-han-lab/hart-0.7b-1024px

Unconditional Image Generation • Updated Nov 17, 2024 • 9
deepseek-ai/Janus-1.3B

Any-to-Any • Updated Nov 14, 2024 • 8.41k • 491
neulab/PangeaInstruct

Updated Oct 25, 2024 • 402 • 78
genmo/mochi-1-preview

Text-to-Video • Updated 17 days ago • 43.2k • 1.13k
stabilityai/stable-diffusion-3.5-large

Text-to-Image • Updated Oct 22, 2024 • 129k • 1.78k
Freepik/flux.1-lite-8B-alpha

Text-to-Image • Updated 5 days ago • 6.9k • 404
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 1.63k • 1.51k
mistralai/Pixtral-12B-Base-2409

Updated Oct 30, 2024 • 69
neulab/Pangea-7B

Updated Oct 24, 2024 • 5.68k • 122
jadechoghari/Ferret-UI-Llama8b

Image-Text-to-Text • Updated Oct 18, 2024 • 448 • 57
OpenGVLab/InternVL2-1B

Image-Text-to-Text • Updated 17 days ago • 57.2k • 59
OpenGVLab/InternVL2-2B

Image-Text-to-Text • Updated 17 days ago • 59.8k • 65
OpenGVLab/Mono-InternVL-2B

Image-Text-to-Text • Updated Nov 21, 2024 • 5.38k • 30
OpenGVLab/OmniCorpus-YT

Updated Nov 17, 2024 • 110 • 9
OpenGVLab/OmniCorpus-CC-210M

Viewer • Updated Nov 17, 2024 • 208M • 686 • 19
OpenGVLab/OmniCorpus-CC

Viewer • Updated Nov 17, 2024 • 986M • 8.03k • 12
OpenGVLab/InternVideo2_chat_8B_HD

Video-Text-to-Text • Updated 18 days ago • 824 • 17
OpenGVLab/ViCLIP

Updated Jun 7, 2024 • 33
OpenGVLab/ASMv2

Text Generation • Updated Feb 29, 2024 • 67 • 17
OpenGVLab/VideoChat2-IT

Viewer • Updated Jun 29, 2024 • 1.82M • 777 • 47
NimVideo/cogvideox-2b-img2vid

Image-to-Video • Updated Oct 28, 2024 • 379 • 62
BAAI/Infinity-MM

Updated 23 days ago • 24.8k • 86
nvidia/RADIO-H

Updated Dec 2, 2024 • 999 • 9
Spawning/PD12M

Viewer • Updated Nov 19, 2024 • 12.4M • 2.16k • 146
Shitao/OmniGen-v1

Text-to-Image • Updated Nov 7, 2024 • 9.67k • 274
InstantX/InstantIR

Image-to-Image • Updated Nov 7, 2024 • 6 • 160
nvidia/Cosmos-Tokenizer-DI8x8

Updated 11 days ago • 208 • 8
BAAI/Emu3-Chat

Text Generation • Updated Oct 24, 2024 • 1.42k • 71
briaai/RMBG-2.0

Image Segmentation • Updated 12 days ago • 256k • 553
Watermark Anything with Localized Messages

Paper • 2411.07231 • Published Nov 11, 2024 • 20
rain1011/pyramid-flow-miniflux

Text-to-Video • Updated Nov 13, 2024 • 157
OpenGVLab/InternVL2-8B-MPO

Image-Text-to-Text • Updated 15 days ago • 1.89k • 34
mistralai/Pixtral-Large-Instruct-2411

Image-Text-to-Text • Updated 9 days ago • 378
briaai/BRIA-2.3

Text-to-Image • Updated Nov 19, 2024 • 515 • 29
microsoft/Reducio-VAE

Updated Nov 21, 2024 • 12 • 15
Lightricks/LTX-Video

Image-to-Video • Updated 16 days ago • 84.3k • 814
apple/aimv2-3B-patch14-448

Image Feature Extraction • Updated Nov 28, 2024 • 318 • 8
THUdyh/Insight-V-Reason

Text Generation • Updated Nov 22, 2024 • 21 • 9
black-forest-labs/FLUX.1-Fill-dev

Updated Nov 25, 2024 • 47.7k • 433
Efficient-Large-Model/Sana_1600M_512px

Text-to-Image • Updated Dec 4, 2024 • 760 • 37
Efficient-Large-Model/Sana_1600M_1024px

Text-to-Image • Updated Dec 4, 2024 • 15.9k • 150
AIDC-AI/Ovis1.6-Gemma2-27B

Image-Text-to-Text • Updated 25 days ago • 1.17k • 59
HuggingFaceTB/SmolVLM-Base

Image-Text-to-Text • Updated Nov 28, 2024 • 11.1k • 50
THUDM/glm-edge-v-5b

Image-Text-to-Text • Updated 2 days ago • 180 • 11
rhymes-ai/Aria-Base-64K

Image-Text-to-Text • Updated Dec 1, 2024 • 2.72k • 11
allenai/pixmo-point-explanations

Viewer • Updated 30 days ago • 79.6k • 323 • 6
tencent/HunyuanVideo

Text-to-Video • Updated 18 days ago • 9.84k • 1.36k
tencent/HunyuanVideo-PromptRewrite

Updated 30 days ago • 205 • 40
google/paligemma2-28b-pt-896

Image-Text-to-Text • Updated about 1 month ago • 1.15k • 40
OpenGVLab/InternVL2_5-78B

Image-Text-to-Text • Updated 17 days ago • 5.01k • 158
MAmmoTH-VL/MAmmoTH-VL-8B

Updated 27 days ago • 199 • 14
MAmmoTH-VL/MAmmoTH-VL-Instruct-12M

Viewer • Updated 24 days ago • 37M • 6.33k • 33
OpenGVLab/PVC-InternVL2-8B

Image-Text-to-Text • Updated 19 days ago • 91 • 8
BGLab/BioTrove

Viewer • Updated 22 days ago • 163M • 599 • 7
TencentARC/NVComposer

Image-to-3D • Updated 19 days ago • 208 • 6
deepseek-ai/deepseek-vl2

Image-Text-to-Text • Updated 17 days ago • 1.97k • 123
FastVideo/FastHunyuan

Text-to-Video • Updated 4 days ago • 538 • 134
BAAI/nova-d48w1536-sdxl1024

Text-to-Image • Updated 15 days ago • 44 • 7
IamCreateAI/Ruyi-Mini-7B

Image-to-Video • Updated 10 days ago • 15.6k • 559
Infinigence/Megrez-3B-Omni

Updated 19 days ago • 656 • 120
microsoft/VidTok

Updated 11 days ago • 24
TIGER-Lab/Mantis-8B-siglip-llama3

Image-Text-to-Text • Updated Nov 15, 2024 • 7.39k • 32
OpenGVLab/HoVLE-HD

Image-Text-to-Text • Updated 11 days ago • 97 • 7
nyu-visionx/cambrian-34b

Text Generation • Updated Jun 28, 2024 • 44 • 28
nyu-visionx/cambrian-phi3-3b

Text Generation • Updated Jul 6, 2024 • 32 • 11
nyu-visionx/Cambrian-Alignment

Viewer • Updated Jul 23, 2024 • 292k • 1.66k • 32
