InfiMM

community


Recent Activity

Infi-MM's activity

xiaotianhan
posted an update 4 months ago
🚀 Excited to announce the release of InfiMM-WebMath-40B, the largest open-source multimodal pretraining dataset designed to advance mathematical reasoning in AI! 🧮✨

With 40 billion tokens, this dataset aims to enhance the reasoning capabilities of multimodal large language models (MLLMs) in the domain of mathematics.

If you're interested in MLLMs, AI, and math reasoning, check out our work and dataset:

🤗 HF: InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning (2409.12568)
📂 Dataset: Infi-MM/InfiMM-WebMath-40B
xiaotianhan
updated a Space 5 months ago
xiaotianhan
posted an update 9 months ago
🎉 🎉 🎉 Happy to share our recent work. We noticed that image resolution plays an important role, whether in improving multi-modal large language model (MLLM) performance or in Sora-style any-resolution encoder-decoders. We hope this work can help lift the 224x224 resolution limit in ViT.

ViTAR: Vision Transformer with Any Resolution (2403.18361)