Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives Paper • 2310.01152 • Published Oct 2, 2023
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models Paper • 2402.01118 • Published Feb 2, 2024
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey Paper • 2409.18169 • Published Sep 26, 2024
Vaccine: Perturbation-aware Alignment for Large Language Models against Harmful Fine-tuning Attack Paper • 2402.01109 • Published Feb 2, 2024
Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack Paper • 2405.18641 • Published May 28, 2024
Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning Paper • 2408.09600 • Published Aug 18, 2024
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation Paper • 2409.01586 • Published Sep 3, 2024
Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation Paper • 2501.17433 • Published Jan 29, 2025