microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated about 3 hours ago • 411k • 1.11k
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation Paper • 2410.17250 • Published Oct 22, 2024 • 15
openai/clip-vit-large-patch14 Zero-Shot Image Classification • Updated Sep 15, 2023 • 45.5M • 1.66k
stabilityai/japanese-stablelm-instruct-gamma-7b Text Generation • Updated Jan 24, 2024 • 2.58k • 52