Yu Yang

yuyangy

https://sites.google.com/g.ucla.edu/yuyang/home

yuyangy's activity

authored 2 papers 9 days ago

AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies

Paper • 2406.17864 • Published Jun 25, 2024

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Paper • 2502.05163 • Published 12 days ago • 20

upvoted a paper 10 days ago

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Paper • 2502.05163 • Published 12 days ago • 20

upvoted 2 papers 4 months ago

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Paper • 2410.22304 • Published Oct 29, 2024 • 17

SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models

Paper • 2403.07384 • Published Mar 12, 2024 • 1

updated a collection 4 months ago

S2L

4 items • Updated Oct 22, 2024

authored 8 papers 4 months ago

CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning

Paper • 2303.03323 • Published Mar 6, 2023 • 1

Interpreting CNNs via Decision Trees

Paper • 1802.00121 • Published Feb 1, 2018

Unsupervised Learning of Neural Networks to Explain Neural Networks

Paper • 1805.07468 • Published May 18, 2018

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

Paper • 2310.06982 • Published Oct 10, 2023

Robust Learning with Progressive Data Expansion Against Spurious Correlation

Paper • 2306.04949 • Published Jun 8, 2023

SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models

Paper • 2403.07384 • Published Mar 12, 2024 • 1

AIR-Bench 2024: A Safety Benchmark Based on Risk Categories from Regulations and Policies

Paper • 2407.17436 • Published Jul 11, 2024

SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI

Paper • 2410.11096 • Published Oct 14, 2024 • 13

New activity in Virtue-AI-HUB/SecCodePLT 4 months ago

Update README.md

#2 opened 4 months ago by

updated a dataset 4 months ago

Virtue-AI-HUB/SecCodePLT

Viewer • Updated Oct 16, 2024 • 1.35k • 234 • 4