Yasunori Ozaki PRO

alfredplpl

https://alfredplpl.github.io/en/index.html

AI & ML interests

Computer Vision, LLM

Recent Activity

new activity 5 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B:Support For Japanese Model

liked a model 5 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

upvoted a paper 5 days ago

Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions

alfredplpl's activity

upvoted a paper 5 days ago

Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions

Paper • 2501.10020 • Published 9 days ago • 21

upvoted a paper 12 days ago

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published 18 days ago • 50

upvoted a paper about 1 month ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 139

upvoted a paper 2 months ago

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Nov 15, 2024 • 113

upvoted a paper 3 months ago

Adaptive Caching for Faster Video Generation with Diffusion Transformers

Paper • 2411.02397 • Published Nov 4, 2024 • 23

upvoted 4 articles 3 months ago

Article

🧨 Diffusers welcomes Stable Diffusion 3.5 Large

Oct 22, 2024

• 50

Article

Allegro: Advanced Video Generation Model

Oct 22, 2024

• 59

Article

Advanced Flux Dreambooth LoRA Training with 🧨 diffusers

Oct 21, 2024

• 33

Article

Fixing Gradient Accumulation

Oct 16, 2024

• 48

upvoted a collection 4 months ago

Llama 3.2

Collection

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 564

upvoted an article 4 months ago

Article

FineVideo: behind the scenes

Sep 23, 2024

• 28

upvoted a collection 4 months ago

CommonCanvas

Collection

Collection of models trained on the CommonCatalogue datasets • 8 items • Updated May 16, 2024 • 9

upvoted 2 papers 4 months ago

LVCD: Reference-based Lineart Video Colorization with Diffusion Models

Paper • 2409.12960 • Published Sep 19, 2024 • 24

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published Sep 18, 2024 • 76

upvoted 2 papers 5 months ago

OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

Paper • 2409.01199 • Published Sep 2, 2024 • 14

VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers

Paper • 2408.17131 • Published Aug 30, 2024 • 12

upvoted a collection 5 months ago

Phi-3

Collection

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated 18 days ago • 548

upvoted 3 papers 5 months ago

Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation

Paper • 2408.15239 • Published Aug 27, 2024 • 29

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 123

xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Paper • 2408.12590 • Published Aug 22, 2024 • 35