GRAPE: Generalizing Robot Policy via Preference Alignment Paper • 2411.19309 • Published 24 days ago • 42
Feedback-Based Self-Learning in Large-Scale Conversational AI Agents Paper • 1911.02557 • Published Nov 6, 2019
A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning Paper • 2204.10815 • Published Apr 22, 2022
Self-Aware Feedback-Based Self-Learning in Large-Scale Conversational AI Paper • 2205.00029 • Published Apr 29, 2022
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning Paper • 2410.22304 • Published Oct 29 • 16
CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning Paper • 2303.03323 • Published Mar 6, 2023 • 1
Unsupervised Learning of Neural Networks to Explain Neural Networks Paper • 1805.07468 • Published May 18, 2018
Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality Paper • 2310.06982 • Published Oct 10, 2023
Robust Learning with Progressive Data Expansion Against Spurious Correlation Paper • 2306.04949 • Published Jun 8, 2023
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models Paper • 2403.07384 • Published Mar 12 • 1
AIR-Bench 2024: A Safety Benchmark Based on Risk Categories from Regulations and Policies Paper • 2407.17436 • Published Jul 11
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI Paper • 2410.11096 • Published Oct 14 • 12
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Paper • 2410.10139 • Published Oct 14 • 51
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published Jul 17 • 48
Safe Reinforcement Learning via Hierarchical Adaptive Chance-Constraint Safeguards Paper • 2310.03379 • Published Oct 5, 2023
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5 • 52