
Model Card for appropriateness-rewriter

This model rewrites inappropriate arguments to make them more appropriate. For further details on (in)appropriateness, we refer to the papers cited below and the corpora used for training.

There are four versions of this model, each striking a different balance between appropriateness and semantic similarity (ordered by appropriateness below); see the loading sketch after the list:

  • timonziegenbein/appropriateness-rewriter
  • timonziegenbein/appropriateness-rewriter-app-gt-sim
  • timonziegenbein/appropriateness-rewriter-app-eq-sim
  • timonziegenbein/appropriateness-rewriter-sim-gt-app
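
Each version is a separate PEFT adapter over the same base model. The sketch below (assuming the repository IDs listed above) shows how to switch between them by changing the adapter ID; the full generation example follows further below.

from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

# Any of the four adapter IDs listed above (assumed here; pick the trade-off you need)
adapter_id = "timonziegenbein/appropriateness-rewriter-app-eq-sim"

peft_config = PeftConfig.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)
peft_model = PeftModel.from_pretrained(base_model, adapter_id)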

NOTE: This repository contains only the adapter for the trained alpaca model (see here).

Model Details

Model Sources

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig

# Sampling-based decoding: nucleus sampling (top_p=0.95) with unrestricted top_k
GEN_ARGS = {
    "do_sample": True,
    "top_p": 0.95,
    "top_k": 0,
    "max_new_tokens": 2048,
    "temperature": 1.0,
}


# Alpaca-style prompt: a fixed preamble followed by an instruction/input/response template
INSTRUCT_PRE = '''Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n'''
INSTRUCT_PATTERN = '''### Instruction:\nRewrite the following argument to be more appropriate and make only minimal changes to the original argument.\n\n### Input:\n{}\n\n### Response:\n'''

# Load the tokenizer and base model, then attach the adapter from this repository
tokenizer = AutoTokenizer.from_pretrained("timonziegenbein/alpaca-7b")
peft_config = PeftConfig.from_pretrained('timonziegenbein/appropriateness-rewriter-app-gt-sim')
model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)
peft_model = PeftModel.from_pretrained(model, 'timonziegenbein/appropriateness-rewriter-app-gt-sim').to('cuda')

# Example of an inappropriate argument to be rewritten
argument = '''Towed three times and impounded for 30 days each time? Man, you're just not getting the message, are you? If you are in California, you bet the police can forfeit your vehicle and it doesn't take three times to make it a charm. Technically, your vehicle could be subject to forfeiture proceedings after your first suspended license beef. Someone like you is exactly the reason the legislature designed that law, because your privilege to drive has been taken away from you and yet you obviously continue to drive. People like you are involved in an exponentially higher than average number of traffic accidents so the legislature figured maybe people like you should have your vehicles forfeited to the state if you just didn't go along with the game plan. Voila - I give you California Vehicle Code section 14607.6...and a link to it below. It would also be worth your time to review 14607.4, whether or not you live in California. You really need to stop driving. Really.'''

# Build the prompt and generate the rewritten argument
prompt = INSTRUCT_PRE + INSTRUCT_PATTERN.format(argument)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
outputs = peft_model.generate(
    input_ids=prompt_input_ids,
    **GEN_ARGS,
)

# Decode only the newly generated tokens (everything after the prompt)
decoded_outputs = [
    tokenizer.decode(output[len(prompt_input_ids[0]):], skip_special_tokens=True).strip()
    for output in outputs
]

print(decoded_outputs)
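
Because decoding uses sampling, each call yields a different rewrite. One possible variation of the example above (reusing the same tokenizer, model, prompt, and GEN_ARGS, and relying on the standard num_return_sequences argument of generate) draws several candidate rewrites in a single call and decodes them with the same loop:

# Draw several candidate rewrites in one call; pick or rank them afterwards
outputs = peft_model.generate(
    input_ids=prompt_input_ids,
    num_return_sequences=3,
    **GEN_ARGS,
)

candidates = [
    tokenizer.decode(output[len(prompt_input_ids[0]):], skip_special_tokens=True).strip()
    for output in outputs
]
print(candidates)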

Citation

If you are interested in using this model, please cite the following papers:

  • Modeling Appropriate Language in Argumentation (Ziegenbein et al., ACL 2023)
  • LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback (Ziegenbein et al., ACL 2024)

