Weizmann Institute of Science

university

https://www.weizmann.ac.il/pages/

weizmannscience




Articles

Universal Assisted Generation: Faster Decoding with Any Assistant Model

Oct 29, 2024

• 51

Faster Assisted Generation with Dynamic Speculation

Oct 8, 2024

• 45

weizmannscience's activity

talidekel

authored a paper about 1 month ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published Feb 5 • 29

omerbartal

authored a paper about 1 month ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published Feb 5 • 29

RafailFridman

authored a paper about 1 month ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published Feb 5 • 29

DanahY

authored a paper about 1 month ago

DynVFX: Augmenting Real Videos with Dynamic Content

Paper • 2502.03621 • Published Feb 5 • 29

talidekel

authored a paper about 2 months ago

TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space

Paper • 2501.12224 • Published Jan 21 • 46

talidekel

authored a paper 5 months ago

VidPanos: Generative Panoramic Videos from Casual Panning Videos

Paper • 2410.13832 • Published Oct 17, 2024 • 13

multimodalart

posted an update 8 months ago


New feature 🔥
Image models and LoRAs now have little previews 🤏

If you don't know where to start looking, I invite you to browse cool LoRAs in the profiles of some amazing fine-tuners: @artificialguybr, @alvdansen, @DoctorDiffusion, @e-n-v-y, @KappaNeuro, @ostris


multimodalart

posted an update 10 months ago


The first open Stable Diffusion 3-like architecture model is JUST out 💣 - but it is not SD3! 🤔

It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B-parameter DiT (diffusion transformer) text-to-image model 🖼️✨, trained with multilingual CLIP + multilingual T5 text encoders for English 🤝 Chinese understanding

Try it out yourself here ▶️ https://huggingface.co/spaces/multimodalart/HunyuanDiT
(a bit too slow, as the model is chunky and the research code isn't super optimized for inference speed yet)

In the paper, they claim it is SOTA among open-source models based on human preference evaluation!

multimodalart

posted an update about 1 year ago


The Stable Diffusion 3 research paper broken down, including some overlooked details! 📝

Model
📏 2 base model variants mentioned: 2B and 8B sizes

📐 New architecture in all abstraction levels:
- 🔽 UNet; ⬆️ Multimodal Diffusion Transformer, bye cross attention 👋
- 🆕 Rectified flows for the diffusion process
- 🧩 Still a Latent Diffusion Model

📄 3 text-encoders: 2 CLIPs, one T5-XXL; plug-and-play: removing the larger one maintains competitiveness

🗃️ Dataset was deduplicated with SSCD, which helped reduce memorization (no more details about the dataset tho)

Variants
🔁 A DPO fine-tuned model showed great improvement in prompt understanding and aesthetics
✏️ An Instruct Edit 2B model was trained, and learned how to do text-replacement

Results
✅ State of the art in automated evals for composition and prompt understanding
✅ Best win rate in human preference evaluation for prompt understanding, aesthetics and typography (missing some details on how many participants and the design of the experiment)

Paper: https://stabilityai-public-packages.s3.us-west-2.amazonaws.com/Stable+Diffusion+3+Paper.pdf


multimodalart

posted an update about 1 year ago


⚔️ TIGER-Lab's Text2Image arena is here! ⚔️
TIGER-Lab/GenAI-Arena

Like https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard for LLMs: you prompt, two images emerge, vote for the best one 🏆

With enough votes this will lead to an Elo-based leaderboard for text-to-image models, go vote! 🗳️
TIGER-Lab/GenAI-Arena
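
For context on how pairwise votes can turn into a ranking, here is a minimal sketch of a standard Elo update in Python. The post does not say which Elo variant, K-factor, or starting rating the arena uses, so `K = 32`, the starting rating of 1000, and the model names below are assumptions for illustration only, not the arena's actual implementation.

```python
from typing import Dict, Iterable, List, Tuple

START_RATING = 1000.0  # assumed starting rating, not taken from the arena
K = 32.0               # assumed K-factor, not taken from the arena


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings: Dict[str, float], winner: str, loser: str,
               k: float = K) -> Dict[str, float]:
    """Apply one vote: the winner gains what the loser gives up."""
    r_win = ratings.get(winner, START_RATING)
    r_lose = ratings.get(loser, START_RATING)
    delta = k * (1.0 - expected_score(r_win, r_lose))
    ratings[winner] = r_win + delta
    ratings[loser] = r_lose - delta
    return ratings


def rate_votes(votes: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Process (winner, loser) votes in order and return final ratings."""
    ratings: Dict[str, float] = {}
    for winner, loser in votes:
        update_elo(ratings, winner, loser)
    return ratings


def leaderboard(ratings: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort models from highest to lowest rating."""
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical model names and votes, for illustration only.
    votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
    for name, rating in leaderboard(rate_votes(votes)):
        print(f"{name}: {rating:.1f}")
```

With these assumed settings, a vote between two equally rated models moves each of them by 16 points. An upset win over a higher-rated model moves ratings more than an expected win, which is why a leaderboard of this kind needs enough votes to stabilize.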

multimodalart

posted an update about 1 year ago


It seems February started with a fully open source AI renaissance 🌟

Models released with fully open dataset, training code, weights ✅

LLM - allenai/olmo-suite-65aeaae8fe5b6b2122b46778 🧠
Embedding - nomic-ai/nomic-embed-text-v1 📚 (sota!)

And it's literally February 1st - can't wait to see what else the community will bring 👀