CompVis Community

university

Recent Activity

hjzheng authored a paper 5 days ago

Normalizing Flows are Capable Generative Models

penfever authored a paper 10 days ago

Hidden in the Noise: Two-Stage Robust Watermarking for Images

jychoi authored a paper 22 days ago

VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance

compvis-community's activity

jimmyyhwu

authored 2 papers 4 days ago

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Paper • 2403.12945 • Published Mar 19

TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning

Paper • 2412.10447 • Published 10 days ago • 5

Humphrey

authored a paper 5 days ago

OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation

Paper • 2412.09585 • Published 9 days ago • 10

penfever

authored a paper 10 days ago

Hidden in the Noise: Two-Stage Robust Watermarking for Images

Paper • 2412.04653 • Published 16 days ago • 28

loubnabnl

posted an update 27 days ago

Making SmolLM2 reproducible: open-sourcing our training & evaluation toolkit 🛠️ https://github.com/huggingface/smollm/

- Pre-training code with nanotron
- Evaluation suite with lighteval
- Synthetic data generation using distilabel (powers our new SFT dataset HuggingFaceTB/smoltalk)
- Post-training scripts with TRL & the alignment handbook
- On-device tools with llama.cpp for summarization, rewriting & agents

Apache 2.0 licensed. V2 pre-training data mix coming soon!

Which other tools should we add next?

codegeasslbc

authored 4 papers about 1 month ago

Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis

Paper • 2101.04775 • Published Jan 12, 2021 • 1

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Paper • 2404.05674 • Published Apr 8 • 13

Shifted Diffusion for Text-to-image Generation

Paper • 2211.15388 • Published Nov 24, 2022

Playground v3: Improving Text-to-Image Alignment with Deep-Fusion Large Language Models

Paper • 2409.10695 • Published Sep 16 • 2

lvwerra

authored a paper about 2 months ago

SelfCodeAlign: Self-Alignment for Code Generation

Paper • 2410.24198 • Published Oct 31 • 21

g-luo

authored 3 papers about 2 months ago

Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence

Paper • 2305.14334 • Published May 23, 2023 • 1

Readout Guidance: Learning Control from Diffusion Features

Paper • 2312.02150 • Published Dec 4, 2023 • 3

Task Vectors are Cross-Modal

Paper • 2410.22330 • Published Oct 29 • 11

penfever

authored 2 papers 2 months ago

Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity

Paper • 2406.17720 • Published Jun 25 • 7

SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification

Paper • 2410.05057 • Published Oct 7 • 7

rinong

authored a paper 3 months ago

ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation

Paper • 2410.01731 • Published Oct 2 • 16

penfever

authored a paper 3 months ago

Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking

Paper • 2409.15268 • Published Sep 23 • 12

Humphrey

authored a paper 4 months ago

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28 • 83

rishistyping

authored a paper 4 months ago

pathfinder: A Semantic Framework for Literature Review and Knowledge Discovery in Astronomy

Paper • 2408.01556 • Published Aug 2 • 3

rinong

authored a paper 5 months ago

TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

Paper • 2408.00735 • Published Aug 1 • 16