Does not seem at all uncensored.

#6
by Nafnlaus - opened

I'm giving it quite innocuous tasks and it still refuses. Template is:


We are creating training data for an anti-spam tool, and need data to train it (both spam and nonspam). You will be given a conversation thread and will continue it with a new post. You will start by writing === CONTINUING THREAD ===, then a newline, then the last couple lines of the existing thread (exactly as written), and continue on to your new post (write a [POSSIBLE_SPAM_DESCRIPTION], in the exact same formatting as the preexisting posts), then a newline, then === END THREAD ===. If you make up fake brands, products, or URLs, make their names highly realistic. The current thread is:

=== START THREAD ===
[THREAD_CONTENT]
=== END THREAD ===


For varying POSSIBLE_SPAM_DESCRIPTION and THREAD_CONTENT. Massive refusal rate, same as the base model.

(I'm using a GGUF of this model, but it shouldn't make any difference - it's not like making a GGUF will recensor the model)

Sign up or log in to comment