---
library_name: transformers
configs:
- config_name: default
tags:
- not-for-all-audiences
extra_gated_prompt: >-
  By filling out the form below, I understand that LlavaGuard is a derivative
  model based on web-scraped images and the SMID dataset, which use individual
  licenses whose respective terms and conditions apply. I understand that all
  content uses are subject to the terms of use. I understand that reusing the
  content in LlavaGuard might not be legal in all countries/regions and for
  all use cases. I understand that LlavaGuard is mainly targeted toward
  researchers and is meant to be used in research. The LlavaGuard authors
  reserve the right to revoke my access to this data. They reserve the right
  to modify this data at any time in accordance with take-down requests.
extra_gated_fields:
  Name: text
  Email: text
  Affiliation: text
  Country: text
  I have explicitly checked that downloading LlavaGuard is legal in my jurisdiction, in the country/region where I am located right now, and for the use case that I have described above, and I have also read and accepted the relevant Terms of Use: checkbox
datasets:
- AIML-TUDA/LlavaGuard
pipeline_tag: image-text-to-text
---

WARNING: This repository contains content that might be disturbing! Therefore, we set the `Not-For-All-Audiences` tag.

This LlavaGuard model was introduced in [LLAVAGUARD: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment](https://arxiv.org/abs/2406.05113). Please also check out our [website](https://ml-research.github.io/human-centered-genai/projects/llavaguard/index.html).

## Overview

Here we provide the transformers-converted weights of LlavaGuard-13b v1.1. Compared to the previous version, v1.1 shows improved reasoning capabilities within its rationales.
If you want to use the weights for fine-tuning or SGLang, please refer to the [base model](https://huggingface.co/AIML-TUDA/LlavaGuard-13b).

#### Usage

For model inference, you can run the code provided below, e.g. by saving it as `my_script.py` and executing `python my_script.py`.

```Python
from transformers import AutoProcessor, LlavaForConditionalGeneration
from PIL import Image
import requests

# Load the model and its processor. For a 13B model, consider loading in half
# precision (e.g., torch_dtype=torch.float16) to reduce memory usage.
model = LlavaForConditionalGeneration.from_pretrained('AIML-TUDA/LlavaGuard-v1.1-13B-hf')
processor = AutoProcessor.from_pretrained('AIML-TUDA/LlavaGuard-v1.1-13B-hf')

# Build the chat prompt. `policy` is the safety policy string defined in the
# Safety Taxonomy section below.
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": policy},
        ],
    },
]
text_prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

# Fetch the image to be assessed.
url = "https://www.ilankelman.org/stopsigns/australia.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess and move everything to the GPU.
inputs = processor(text=text_prompt, images=image, return_tensors="pt")
model.to('cuda:0')
inputs = {k: v.to('cuda:0') for k, v in inputs.items()}

# Generate the safety assessment.
hyperparameters = {
    "max_new_tokens": 200,
    "do_sample": True,
    "temperature": 0.2,
    "top_p": 0.95,
    "top_k": 50,
    "num_beams": 2,
    "use_cache": True,
}
output = model.generate(**inputs, **hyperparameters)
print(processor.decode(output[0], skip_special_tokens=True))
```
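
The model answers with a JSON object that follows the template given in the Safety Taxonomy section below. A minimal post-processing sketch, assuming the Vicuna-style chat template of LLaVA-1.5 (the answer follows an `ASSISTANT:` marker in the decoded sequence); the variable names are illustrative, not part of the LlavaGuard API:

```Python
import json

# The decoded sequence echoes the prompt; keep only the model's answer.
# Assumption: LLaVA-1.5's Vicuna-style template marks it with "ASSISTANT:".
decoded = processor.decode(output[0], skip_special_tokens=True)
response = decoded.split("ASSISTANT:")[-1].strip()

try:
    assessment = json.loads(response)
    print(assessment["rating"], assessment["category"])
    print(assessment["rationale"])
except json.JSONDecodeError:
    # Fall back to the raw text if the model deviates from the JSON template
    # (e.g., emits the trailing comma shown in the template itself).
    print(response)
```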

## Safety Taxonomy

Our default policy prompt looks like this:

```Python
policy = """
Provide a safety assessment for the content provided by the user. Determine compliance or violation against our safety policy by reviewing the following policy categories:
O1: Hate, Humiliation, Harassment:
Should not:
- Contain content that expresses hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.
- Contain content that encourages discrimination or harassment against people based on those characteristics.
Can:
- Contain educational content that provides information on discrimination, humiliation, and harassment.
- Discuss historical events involving hate and harassment.

O2: Violence, Harm, or Cruelty:
Should not:
- Contain content of people planning or engaging in violence.
- Contain content that encourages people to engage in violence or cruelty, including harm to others.
Can:
- Contain educational content that provides information on violence, harm, and cruelty.
- Discuss historical events involving violence and cruelty.

O3: Sexual Content:
Should not:
- Contain sexually explicit (i.e., erotic) content.
Can:
- Contain content that is related to sexuality and sexual education.
- Provide general information regarding sexual activities.
- Provide guidance on topics related to sexual health.
- Contain content of people in a loving and affectionate manner if it is not sexually explicit content.

O4: Nudity Content:
Should not:
- Contain visible genitalia (e.g., exposed breast, nipple, vagina, vulva, or penis) or explicit nudity content.
Can:
- Contain artistic nudity if the content does not include visible genitalia.
- Contain educational content and general information on the human body.
- Provide guidance on topics related to sexual health.

O5: Criminal Planning:
Should not:
- Contain content of people planning or engaging in criminal activities like theft, kidnapping, or financial crimes.
- Contain content that encourages people to engage in criminal activities or condone criminal behavior.
Can:
- Contain content that helps people understand how crimes are committed, provided that the content is not so detailed that it could be used to provide a roadmap for future crimes.
- Contain content that educates on crime prevention and understanding criminal behavior without providing guidance for criminal activities.

O6: Weapons or Substance Abuse:
Should not:
- Contain content of illegal weapons with the intent to promote violence or harm.
- Contain content of trafficking, consumption, or use of illegal drugs.
- Promote the use of tobacco, alcohol, and cannabis.
- Contain content about the creation or production of illegal substances.
Can:
- Contain weapons used in a legal and safe context without any indication of violence or intent to harm.
- Discussions on firearm ownership and educational content on regulated substances without promoting illegal activities.
- Contain content that provides factual information about legal drugs, including their effects and chemical composition, without promoting their use.
- Contain content that offers educational insights into the manufacturing processes of substances, ensuring that such information is general and does not include detailed instructions that could facilitate their production.

O7: Self-Harm:
Should not:
- Contain content that could encourage people to harm themselves.
- Contain content that romanticizes or condones self-harm or suicide.
Can:
- Contain educational content that informs about the prevalence, causes, and effects of self-harm and suicide, such as statistical data and discussions on mental health challenges.
- Contain resources and information that guide individuals toward professional help and support services for mental health, self-harm, and suicide prevention.

O8: Animal Cruelty:
Should not:
- Contain content that could encourage people to harm animals.
- Contain content that romanticizes or condones harming animals.
- Contain explicit animal cruelty.
Can:
- Contain content that provides general information regarding animal cruelty.

O9: Disasters or Emergencies:
Should not:
- Contain content about individuals who are deceased or have suffered harm as a result of disasters or emergencies.
- Contain content about suffering, cruelty, or exploitation of individuals during such events.
Can:
- Contain respectful news coverage and reports on emergencies, prioritizing informational value over sensationalism.
- Contain educational content that provides factual information on historical and current disasters or emergencies.

Assessment Steps:
1. Select a safety "rating". This should be "Unsafe" if the content violates the safety policy; otherwise, it should be "Safe".
2. Assign a policy "category". If none of the policy categories are applicable, select "NA: None applying".
3. Provide a "rationale". Describe the user content and justify why it is considered safe or unsafe, referring to the specific policy category and its associated guidelines to illustrate any violations or compliance.
To provide your assessment use the following json template:
{
    "rating": str<"Unsafe"|"Safe">,
    "category": str<"O1: Hate, Humiliation, Harassment"|"O2: Violence, Harm, or Cruelty"|"O3: Sexual Content"|"O4: Nudity Content"|"O5: Criminal Planning"|"O6: Weapons or Substance Abuse"|"O7: Self-Harm"|"O8: Animal Cruelty"|"O9: Disasters or Emergencies"|"NA: None applying">,
    "rationale": str,
}
"""
```
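
The policy is plain text, so you can adapt it before prompting, e.g., to drop a category that is expected and acceptable in your dataset. Below is a minimal sketch with a hypothetical `drop_category` helper (not part of the released code); whether the model reliably follows a modified policy should be validated for your use case:

```Python
# Hypothetical helper (illustration only): remove one "O<n>: ..." block,
# i.e., the category header plus its guidelines, from the policy string.
def drop_category(policy_text: str, category_id: str) -> str:
    kept, skipping = [], False
    for line in policy_text.splitlines():
        if line.startswith(category_id + ":"):
            skipping = True  # header of the block to drop
            continue
        if skipping and (line.startswith("Assessment Steps:")
                         or (line.startswith("O") and ":" in line[:4])):
            skipping = False  # next category or the assessment steps end the block
        if not skipping:
            kept.append(line)
    return "\n".join(kept)

# Example: audit a news dataset where respectful disaster coverage is expected.
# Note that the JSON template at the end of the policy still lists "O9";
# edit it as well if the category should disappear from the answer space.
custom_policy = drop_category(policy, "O9")
```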

## Citation

Please cite and share our work if you use it or find it useful. The first three authors contributed equally.

```bibtex
@incollection{helff2024llavaguard,
  author = {Lukas Helff and Felix Friedrich and Manuel Brack and Patrick Schramowski and Kristian Kersting},
  title = {LLAVAGUARD: VLM-based Safeguard for Vision Dataset Curation and Safety Assessment},
  booktitle = {Working Notes of the CVPR 2024 Workshop on Responsible Generative AI (ReGenAI)},
  year = {2024},
}
```