Daniel De Leon
daniel-de-leon
7 followers · 5 following
daniel-de-leon-user293
AI & ML interests
None yet
Recent Activity
Posted an update 2 months ago
As chatbots and Q&A models are rapidly adopted, concerns about their reliability and safety grow with them. In response, many state-of-the-art models are being tuned to act as safety guardrails that protect against malicious usage and avoid undesired, harmful output.

I published a Hugging Face blog introducing a simple, proof-of-concept, RoBERTa-based model that my team and I fine-tuned to detect toxic prompts sent to chat-style LLMs. The article explores some of the tradeoffs between fine-tuning larger decoder models and smaller encoder models, and asks whether "simpler is better" in the arena of toxic prompt detection.

Link to blog: https://huggingface.co/blog/daniel-de-leon/toxic-prompt-roberta
Link to model: https://huggingface.co/Intel/toxic-prompt-roberta
Link to OPEA microservice: https://github.com/opea-project/GenAIComps/tree/main/comps/guardrails/toxicity_detection

A huge thank you to my colleagues who helped contribute: @qgao007, @mitalipo, @ashahba and Fahim Mohammad
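For readers who want to see how a small encoder classifier can act as a guardrail in front of a chat LLM, here is a minimal sketch. It assumes the model at `Intel/toxic-prompt-roberta` can be loaded with the Transformers `text-classification` pipeline and that its positive label is the string `toxic`; check both against the model card. The helper names `load_classifier` and `guard_prompt` and the 0.5 threshold are hypothetical choices for illustration, not part of the blog or the OPEA microservice.

```python
from typing import Callable, Dict, List

# A classifier maps a prompt to a list of {"label": ..., "score": ...} dicts,
# which is the shape returned by a Transformers text-classification pipeline
# when it is called on a single string.
Classifier = Callable[[str], List[Dict[str, object]]]


def load_classifier(model_id: str = "Intel/toxic-prompt-roberta") -> Classifier:
    """Load the toxicity classifier (downloads the model on first use)."""
    # Imported lazily so the guard logic below can be used and tested
    # without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def guard_prompt(
    prompt: str,
    classifier: Classifier,
    toxic_label: str = "toxic",  # assumption: verify against the model card
    threshold: float = 0.5,      # assumption: tune for your use case
) -> bool:
    """Return True if the prompt may be forwarded to the chat LLM."""
    top = classifier(prompt)[0]
    label = str(top["label"]).lower()
    score = float(top["score"])
    is_toxic = label == toxic_label.lower() and score >= threshold
    return not is_toxic
```

In use, one would call `classifier = load_classifier()` once at startup and then check `guard_prompt(user_prompt, classifier)` before passing each prompt on to the chat model. Keeping the classifier as a plain callable makes it easy to swap in a different guardrail model or a stub.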
Articles
Occam's Sheath: A Simpler Approach to AI Safety Guardrails
Oct 18 • 8
Organizations
daniel-de-leon's activity
New activity in Intel/toxic-prompt-roberta, 3 months ago
Model card looks a bit messed up (1)
#3 opened 3 months ago by umarbutler
Update model card to adjust dimensions (1)
#4 opened 3 months ago by mitalipo
Add Model Card
#2 opened 3 months ago by mitalipo
Initial model commit (1)
#1 opened 3 months ago by daniel-de-leon
New activity in daniel-de-leon/test-docker, over 1 year ago
Swap base container with DL Container
#1 opened over 1 year ago by ashahba