CSRT: Evaluation and Analysis of LLMs using Code-Switching Red-Teaming Dataset Paper • 2406.15481 • Published Jun 17
KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge Paper • 2402.13605 • Published Feb 21
KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application Paper • 2305.17701 • Published May 28, 2023
TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification Paper • 2402.12991 • Published Feb 20
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models Paper • 2310.08491 • Published Oct 12, 2023