Hwaran Lee

hwaranlee

https://hwaranlee.github.io

AI & ML interests

Safety and Trustworthiness of AI / Language Models

hwaranlee's activity

liked a dataset about 2 months ago

jiyounglee0523/KorNAT

Viewer • Updated May 13 • 48 • 64 • 6

upvoted a paper 6 months ago

CSRT: Evaluation and Analysis of LLMs using Code-Switching Red-Teaming Dataset

Paper • 2406.15481 • Published Jun 17 • 1

liked a model 7 months ago

mbkim/LifeTox_Moderator_7B

Text Classification • Updated Mar 20 • 11 • 2

authored a paper 8 months ago

KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge

Paper • 2402.13605 • Published Feb 21

updated a dataset 8 months ago

naver-ai/kobbq

Viewer • Updated Apr 16 • 81.1k • 94 • 5

authored 5 papers 10 months ago

Critic-Guided Decoding for Controlled Text Generation

Paper • 2212.10938 • Published Dec 21, 2022

KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application

Paper • 2305.17701 • Published May 28, 2023 • 1

ProPILE: Probing Privacy Leakage in Large Language Models

Paper • 2307.01881 • Published Jul 4, 2023

TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification

Paper • 2402.12991 • Published Feb 20

LifeTox: Unveiling Implicit Toxicity in Life Advice

Paper • 2311.09585 • Published Nov 16, 2023

liked a dataset 10 months ago

naver-ai/kobbq

Viewer • Updated Apr 16 • 81.1k • 94 • 5

authored a paper 10 months ago

KoBBQ: Korean Bias Benchmark for Question Answering

Paper • 2307.16778 • Published Jul 31, 2023

authored a paper about 1 year ago

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 53