Alpha-VLLM

company

https://github.com/Alpha-VLLM


AI & ML interests

None defined yet.



Alpha-VLLM's activity

JackyZhuo

authored a paper about 1 month ago

Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation

Paper • 2412.09428 • Published Dec 12, 2024 • 7

csuhan

authored a paper about 1 month ago

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Paper • 2412.02611 • Published Dec 3, 2024 • 23

JackyZhuo

authored a paper about 2 months ago

VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection

Paper • 2411.14794 • Published Nov 22, 2024 • 13

stzhao

authored 3 papers 3 months ago

Unleashing the Potentials of Likelihood Composition for Multi-modal Language Models

Paper • 2410.00363 • Published Oct 1, 2024 • 1

Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models

Paper • 2312.06685 • Published Dec 9, 2023 • 1

Boosting Open-Domain Continual Learning via Leveraging Intra-domain Category-aware Prototype

Paper • 2408.09984 • Published Aug 19, 2024 • 1

csuhan

authored a paper 3 months ago

Remember, Retrieve and Generate: Understanding Infinite Visual Concepts as Your Personalized Assistant

Paper • 2410.13360 • Published Oct 17, 2024 • 8

Cxxs

authored a paper 3 months ago

I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow

Paper • 2410.07536 • Published Oct 10, 2024 • 5

stzhao

authored a paper 4 months ago

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 24

JackyZhuo

authored 6 papers 5 months ago

Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT

Paper • 2406.18583 • Published Jun 5, 2024

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

Paper • 2405.05945 • Published May 9, 2024 • 2

Video Background Music Generation: Dataset, Method and Evaluation

Paper • 2211.11248 • Published Nov 21, 2022

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Paper • 2306.17103 • Published Jun 29, 2023 • 1

GraphText: Graph Reasoning in Text Space

Paper • 2310.01089 • Published Oct 2, 2023 • 2

LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation

Paper • 2408.15881 • Published Aug 28, 2024 • 21

Cxxs

authored 4 papers 5 months ago

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

Paper • 2405.05945 • Published May 9, 2024 • 2

Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT

Paper • 2406.18583 • Published Jun 5, 2024

VEnhancer: Generative Space-Time Enhancement for Video Generation

Paper • 2407.07667 • Published Jul 10, 2024 • 14

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Paper • 2408.02657 • Published Aug 5, 2024 • 33

stzhao

authored a paper 5 months ago

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Paper • 2408.02657 • Published Aug 5, 2024 • 33


Team members 15
