If you have roughly 300 GB of VRAM, you can run Mochi from @genmo — a SOTA model that dramatically closes the gap between closed and open video generation models.
Mochi 1 is built on a novel architecture that reasons jointly over 44,520 video tokens with full 3D attention. It extends learnable rotary positional embeddings (RoPE) to three dimensions, with the mixing frequencies for the space and time axes learned by the network.
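To make the 3D RoPE idea concrete, here is a minimal numpy sketch. The function names, the way the head dimension is split across axes, and the frequency values are all illustrative assumptions, not Mochi's actual implementation: each axis (time, height, width) gets its own slice of the head dimension and its own rotary rotation, with the per-axis frequency arrays standing in for the parameters the network would learn.

```python
import numpy as np

def rope_1d(x, pos, freqs):
    # Rotate pairs (x[i], x[i + half]) by angle pos * freqs[i].
    # x: (..., d) with d even; pos: scalar position; freqs: (d // 2,)
    half = x.shape[-1] // 2
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

def rope_3d(x, t, h, w, freqs_t, freqs_h, freqs_w):
    # Split the head dimension into three equal chunks, one per axis,
    # and apply an independent rotary rotation to each chunk.
    d3 = x.shape[-1] // 3
    return np.concatenate([
        rope_1d(x[..., :d3],      t, freqs_t),   # time axis
        rope_1d(x[..., d3:2*d3],  h, freqs_h),   # height axis
        rope_1d(x[..., 2*d3:],    w, freqs_w),   # width axis
    ], axis=-1)
```

The payoff of this factorization is the usual RoPE property, now per axis: the dot product between a rotated query and key depends only on the relative (Δt, Δh, Δw) offset between their positions, which is what lets attention generalize across absolute token locations.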
The model incorporates several architectural improvements, including:
- SwiGLU feedforward layers
- Query-key normalization for enhanced stability
- Sandwich normalization for controlled internal activations
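A compact numpy sketch of what these three components do. This is illustrative only: RMSNorm is assumed as the normalizer, and the function names and shapes are mine, not Mochi's.

```python
import numpy as np

def silu(x):
    return x / (1.0 + np.exp(-x))

def rms_norm(x, eps=1e-6):
    # RMSNorm (assumed here): rescale each vector to unit root-mean-square.
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

def swiglu_ffn(x, W_gate, W_up, W_down):
    # SwiGLU feedforward: a SiLU-gated branch multiplies the up projection
    # elementwise before the down projection.
    return (silu(x @ W_gate) * (x @ W_up)) @ W_down

def qk_norm_scores(q, k):
    # Query-key normalization: normalize q and k before the dot product,
    # which bounds attention logits and stabilizes training.
    return rms_norm(q) @ rms_norm(k).T

def sandwich_block(x, sublayer):
    # Sandwich normalization: normalize both the sublayer's input and its
    # output before the residual add, keeping internal activations controlled.
    return x + rms_norm(sublayer(rms_norm(x)))
```

Note the effect of QK-norm: with both sides RMS-normalized, each row has norm sqrt(d), so every attention logit is bounded by d in magnitude regardless of how large the raw activations grow.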
What is currently available? The base model delivers impressive 480p video generation with exceptional motion quality and prompt adherence. Released under the Apache 2.0 license, it's freely available for both personal and commercial applications.
What's Coming? Genmo has announced Mochi 1 HD, scheduled for release later this year, which will feature:
- Enhanced 720p resolution
- Improved motion fidelity
- Better handling of warping in complex scenes