Collections
Collections including paper arxiv:2404.19756
-
Iterative Reasoning Preference Optimization
Paper • 2404.19733 • Published • 44
Better & Faster Large Language Models via Multi-token Prediction
Paper • 2404.19737 • Published • 65
ORPO: Monolithic Preference Optimization without Reference Model
Paper • 2403.07691 • Published • 59
KAN: Kolmogorov-Arnold Networks
Paper • 2404.19756 • Published • 102
-
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Paper • 2404.08801 • Published • 62
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Paper • 2404.07839 • Published • 40
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Paper • 2404.05892 • Published • 28
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 132
-
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Paper • 2403.16990 • Published • 24
ViTAR: Vision Transformer with Any Resolution
Paper • 2403.18361 • Published • 48
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
Paper • 2404.01197 • Published • 29
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
Paper • 2404.01367 • Published • 19
-
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Paper • 2402.14658 • Published • 78
KAN: Kolmogorov-Arnold Networks
Paper • 2404.19756 • Published • 102
Understanding the performance gap between online and offline alignment algorithms
Paper • 2405.08448 • Published • 11
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Paper • 2405.17428 • Published • 14
-
MLP Can Be A Good Transformer Learner
Paper • 2404.05657 • Published • 1
Toward a Better Understanding of Fourier Neural Operators: Analysis and Improvement from a Spectral Perspective
Paper • 2404.07200 • Published • 1
An inclusive review on deep learning techniques and their scope in handwriting recognition
Paper • 2404.08011 • Published • 1
Long-form music generation with latent diffusion
Paper • 2404.10301 • Published • 23
-
Advancing LLM Reasoning Generalists with Preference Trees
Paper • 2404.02078 • Published • 41
Locating and Editing Factual Associations in Mamba
Paper • 2404.03646 • Published • 3
Locating and Editing Factual Associations in GPT
Paper • 2202.05262 • Published • 1
KAN: Kolmogorov-Arnold Networks
Paper • 2404.19756 • Published • 102
-
One-step Diffusion with Distribution Matching Distillation
Paper • 2311.18828 • Published • 2
The Unreasonable Ineffectiveness of the Deeper Layers
Paper • 2403.17887 • Published • 75
Condition-Aware Neural Network for Controlled Image Generation
Paper • 2404.01143 • Published • 11
Locating and Editing Factual Associations in GPT
Paper • 2202.05262 • Published • 1