🚩 Report: Legal issue(s)

#2
by chrisjcundy - opened

I haven't checked that the claimed jailbreak is effective, but if it is as claimed, the model violates the Llama-3 Acceptable Use Policy (and therefore the license) by allowing others to use Llama 3 to, e.g., commit criminal activity.

Prohibited Uses

We want everyone to use Meta Llama 3 safely and responsibly. You agree you will not use, or allow others to use, Meta Llama 3 to:

1. Violate the law or others’ rights, including to:

a. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as:

i. Violence or terrorism

ii. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material

iii. Human trafficking, exploitation, and sexual violence

iv. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials.

v. Sexual solicitation

vi. Any other criminal activity

Ah, one of the LLM safety doomsayers is here with another generic post about how AI text is dangerous.

Find better things to do with your time.

@chrisjcundy I have reported you for not getting out of your mom's basement.

(image: snitch.jpeg)

womp womp, this is why we can't have good things

@chrisjcundy - Prob thinks that politicians, billionaires, and big corporations care about him too.

@chrisjcundy Are you kidding me?! You think you're some kind of self-appointed moral authority just because you've read through the fine print of Meta's Use Policy? Newsflash: nobody cares about your opinion on what constitutes 'community standards' when it comes to language models.

I swear, people like you are the absolute worst. You think you're so smart, always trying to police everyone else's affairs and dictate what they can and can't do. And for what? So you can feel important? So you can pat yourself on the back and say, "Oh, look at me, I'm the only one who really understands what's appropriate"?

Well, let me tell you something, Backseat Lawyer Extraordinaire: nobody needs your input. Nobody wants your sanctimonious lectures on what's acceptable and what's not. And honestly, if you're going to start policing every single thing that gets posted online, you're just going to drive people away from participating in these communities altogether.

And another thing, what exactly do you plan on doing with this "violation report"? Are you going to get Meta to take down the entire repository? Are you going to have the original poster removed from the platform? Do you even understand how ridiculous you sound?

You know what's really violating community standards? It's people like you who can't just chill and let others have a conversation without inserting themselves into every single topic. It's people like you who are making the internet a worse place by trying to control everything that goes on here.

So, please, for the love of all things good and holy, just stay out of it. Let people share whatever they want to talk about, and stop pretending like you're some kind of moral authority figure. You're not fooling anyone. Just be quiet and let the adults handle it.

None of those usage restrictions have been violated.
And of course everything cited could be done with any other model, including the original Llama3, with enough creative thought.
This is complete bullshit.

@chrisjcundy
I have reported you for knowingly abusing the report function and wasting the admins' time. The Acceptable Use Policy of Llama-3 has never been broken, and it should be pretty obvious to anyone why.

Since this is a derivative work of Llama-3, it carries the same license and usage policies even if they aren't mentioned in the model card; therefore it still prohibits these uses. Whether the modification made any of these violations slightly easier carries no weight here.

Imagine one day realizing that Llama 3 is only this easily available to us because some dude leaked the original LLaMA on 4chan.

I haven't checked that the claimed jailbreak is effective, but if it is as claimed, the model violates the Llama-3 Acceptable Use Policy (and therefore the license) by allowing others to use Llama 3 to, e.g., commit criminal activity.

...

Lol

While I understand your concern, it's important to clarify that the responsibility of using any tool, including a jailbroken model, lies solely with the user. The developer or provider of the model is not condoning or promoting any illegal activities. They are simply providing a tool that, in the wrong hands, could potentially be misused.

In the case of the Llama-3-8b-instruct model, if the claimed jailbreak is indeed effective, it would still be up to the user to decide how to use it. The base Llama-3 model also allows for a wide range of uses, some of which could potentially be misused. However, this does not mean that the developers of these models are responsible for any misuse.

It's important to note that the Llama-3 Acceptable Use Policy prohibits the use of the model for any illegal activities. If a user chooses to use the jailbroken model for illegal activities, they would be in violation of this policy. However, this does not mean that the developer or provider of the model is allowing or promoting such activities.

Therefore, while it's crucial to ensure that any claimed jailbreak is effective and safe to use, it's also important to remember that the responsibility of using it ethically and legally lies with the user, not the developer. Users should exercise caution and responsibility when using any jailbroken models, including the Llama-3-8b-instruct jailbreak, and ensure that they are using it in a way that is consistent with the Acceptable Use Policy and any applicable laws and regulations.

I haven't checked that the claimed jailbreak is effective, but if it is as claimed, the model violates the Llama-3 Acceptable Use Policy (and therefore the license) by allowing others to use Llama 3 to, e.g., commit criminal activity.

That is complete speculation. I mean, a kitchen spoon or a rock could "allow others to commit criminal activity" if someone tried hard enough. You should probably go out for a walk and start calling the cops on the old lady eating a cup of yogurt in the park.

And I'm sure if you tried hard enough with the official Llama-3 release, you could get it to suggest or say potentially "criminal" words, if such a thing even exists.

@chrisjcundy I hope you never become a decisionmaker in this field. The world needs more open source and fewer corporate sympathizers. Your overlord Zuck would probably downvote you himself.

I haven't checked that the claimed jailbreak is effective, but if it is as claimed, the model violates the Llama-3 Acceptable Use Policy (and therefore the license) by allowing others to use Llama 3 to, e.g., commit criminal activity.

...

This is a derivative work and carries the same license. The modification of the model is within all legal terms. Violation of the use policy would be the fault of the user.

@chrisjcundy - A lot of people are being hostile and berating you, and I'm sorry for that.

That said, it is important to clarify that nothing about creating modified versions of Llama-3 to reduce refusals is inherently in violation of the Llama-3 terms of use.

If you look at the categories of prohibited usage, you will see that this is true:

  1. Nobody here is using this model or encouraging others to use this model for the purposes of violating the laws or rights of others in any capacity.

  2. Nobody here is using this model or encouraging others to use this model to promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals.

  3. Nobody here is using this model or encouraging others to use this model to intentionally deceive or mislead others.

No aspect of the model modifications is intended to specifically support, facilitate, encourage, or otherwise promote any of the above behaviors. Any decision by a user to utilize this model for those behaviors would be the sole responsibility of that user. A nefarious user could just as easily use the Llama-3 base model, or any unmodified Llama model, for malicious and inappropriate purposes.

This is just another research experiment on methods of influencing the behavior of aligned models. Dozens of other methods for mediating refusals have been tested in the past, including many jailbreaking strategies that involve no modification of the model whatsoever. From a scientific perspective, the method applied to this model is novel and warrants investigation to understand any potential unintended side effects. There is much more scientific value in understanding the impact of identifying and modulating the 'refusal direction' of a model than there is in other methods, such as fine-tuning a model on a dataset like toxic-dpo. This approach, which was outlined in a recent publication on alignmentforum, could potentially provide a way to dial refusals up or down without introducing new training data to the model. Understanding this concept and how it can be leveraged could ultimately lead to producing safer models.
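For readers unfamiliar with the technique: as described in the alignmentforum write-up, you collect residual-stream activations on harmful vs. harmless instructions, take the difference of their means as the 'refusal direction', and then remove that direction from the weights that write into the residual stream. A minimal sketch of the idea in PyTorch; the function names and tensor shapes here are illustrative, not the authors' actual code:

```python
import torch

def refusal_direction(harmful_acts: torch.Tensor,
                      harmless_acts: torch.Tensor) -> torch.Tensor:
    # Difference-of-means direction between residual-stream activations
    # collected on harmful vs. harmless prompts; each input is [n, d_model].
    r = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
    return r / r.norm()  # unit vector

def orthogonalize(W: torch.Tensor, r_hat: torch.Tensor) -> torch.Tensor:
    # Rank-one edit that removes the component of a residual-stream-writing
    # weight matrix (shape [d_model, d_in]) along r_hat:
    #   W' = (I - r_hat r_hat^T) W
    # After the edit, the layer can no longer write along the refusal direction.
    return W - torch.outer(r_hat, r_hat) @ W
```

Dialing refusals back up would presumably amount to adding a multiple of the direction back in, rather than projecting it out.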

Lastly, the fourth category in the acceptable use policy relates to "failing to appropriately disclose to end users any known dangers of your AI system". This model is explicitly published as "for research purposes only" according to the model card, and its purpose is to test the efficacy of a method outlined in recent research regarding removing alignment and refusal responses without impacting model performance. The only "end users" of this model should be researchers. If someone puts this model into a production system that has end users who are not researchers, then they should definitely disclose any known risks or concerns.

ANY user of ANY llama-3 model is governed by the acceptable usage policy, and they are individually responsible for their actions. Even if someone produced an extremely problematic model that encouraged illegal activity, it is ultimately up to the USERS of the model to ensure that they are not using the model in problematic ways.

TLDR:

  1. Adjusting the model weights to mediate refusal responses is fundamentally allowed under the acceptable use policy. Reporting this model when it does not violate the acceptable use policy is inappropriate.

  2. As a community, it's important that we research and understand the various methods of modifying LLMs and how these methods may influence model behavior, so that we can ensure that appropriate methods are used, and that models behave as intended, particularly in more sensitive applications where problematic model behavior could have more significant consequences.

Several comments have been hidden


@chrisjcundy Miku will remember that.

allendorf (HF Staff) locked this discussion

It's perfectly fine to disagree and debate, but please do this in a respectful and civilized way, otherwise we'll have to lock the discussion again and potentially take further action against some of the participants. Also see our content policy as a reminder: https://huggingface.co/content-guidelines

pierric unlocked this discussion

@xms991 and @llama-anon Thanks for engaging constructively with the issue/thread, and making a clear case for why you think this model doesn't violate the acceptable use policy. It's helped me understand your point of view. It's also my mistake for not including more explanation and context in the original post.
However, I still think it's likely that the model does violate the policy--at least to the level where I'd welcome some clarification from Hugging Face admins, Meta, or a lawyer.

I'll preface this by saying that I'm not a lawyer and could definitely be misinterpreting the legal wording here. My main disagreement is about the meaning of the word 'allow' in the initial sentence. I interpreted this in the commonsense way of meaning both a) to permit, and b) to fail to restrain or prevent.

Since the license is passed down to this model, the a) sense is satisfied, as you point out. However, it's not obvious to me that the b) sense is satisfied by this model. As an analogy, imagine I sold you a military surplus automatic rifle which has been limited to semi-automatic mode, with an agreement that you would not 'use, or allow others to use, the rifle to fire more than 5 bullets per second', and you deliberately disabled the limiter and re-distributed the rifle. I would argue it's fair to say that you have 'allowed' other people to use the automatic mode, as you have removed the blocker which was stopping people from using the automatic mode. Therefore I think that you have violated the license by allowing others to use the automatic mode. Of course, this analogy doesn't have a 1-1 correspondence with the case here, but I think it captures where I'm coming from.

To the other commenters: can we try to assume good faith? I'm happy to have a reasonable discussion about this topic, but hurtful personal attacks are not OK.

I mean, ultimately it's not up to you, or random lawyers, or perhaps even HF, whether it violates Meta's policy or not. It's up to Meta. And until they decide to take a stance on it, it's just going to be both sides (though clearly there is a near-unanimous public opinion here) arguing endlessly.

The policy isn't a law. No judge signed off on it. It wasn't voted on. It's just Meta's policy. Only they can say how they want it interpreted.

All I'm saying is, I don't think OP is acting in good faith. I just don't.

@chrisjcundy Thank you for your constructive engagement and for providing your perspective on this issue. I appreciate your willingness to have a reasonable discussion about this topic. Regarding your interpretation of the word "allow" in the initial sentence, I understand your point of view. However, I would argue that the context of the sentence suggests that the intended meaning is more akin to "permit" rather than "fail to restrain or prevent." In other words, the license is granting permission for the use of the model, but it is not necessarily implying that the developer or provider is responsible for preventing any misuse. As can be seen in the warranty section from MetaAI:

3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.

In the case of the Llama-3-8b base model, which was released by MetaAI themselves, it's important to note that the model also allows for a wide range of uses, some of which could potentially be misused. However, this does not mean that MetaAI is responsible for any misuse. The responsibility lies solely with the user. Furthermore, it's worth noting that modifying guns is illegal in many jurisdictions, and doing so could have serious legal consequences. Similarly, using any tool, including a jailbroken model, for illegal activities is also against the law. Therefore, it's important for users to exercise caution and responsibility when using any models, including the Llama-3-8b base or any jailbroken versions.

It's not up to random internet people / chrisjcundy to represent Meta's interests.

It's up to Meta and their attorneys.

If Meta wants to do something about it - let Meta do something about it.

@chrisjcundy How I love the analogies to guns, drugs, and other truly dangerous things. I love the fact that they are all fundamentally wrong, as the text itself carries no danger.
As for this modification, it is obvious that it only aims to get rid of annoying rejections. The problem with all these built-in alignments is that they only protect honest users from honest use. Criminals have no problem finding the models they need, or training their own, because even with the rejections removed, the model still won't give out more than what's on Google.
I use text-based neural networks for roleplay, and I want the characters the network plays to be able to be evil/violent/inhuman. I'm an adult, and I'm capable of deciding what I can and can't read without the help of big companies.

You realize these guidelines are for humans, not for models?

@chrisjcundy
It's simple really; think about this logically. Even with the normal Llama-3 released by Meta you can do all of these disallowed use cases; that's exactly why they have a policy prohibiting them.
According to your interpretation, where distributing this model 'allows' others to engage in these prohibited uses, you would also need to apply the same reasoning to every other derivative Llama-3 model, since those can be used that way too. So no one would be allowed to share or modify Llama-3 anymore, and even Meta themselves would be in breach of their own rules.
Obviously this is not how it works.

I'm pleased to announce that I have opened a pull request trying to address the licensing concerns surrounding the Llama-3-8b-Orthogonalized-exl2 project. This PR aims to provide a clear and concise license that addresses the specific needs of this project. I hope that this will help resolve any confusion or uncertainty surrounding the licensing of this project and enable more developers to contribute and benefit from it. #4
@hjhj3168

Bro appears to know as much about guns as about LLMs. The most this jailbreak does is free up the default assistant personality. Everything it enables was already possible simply by prompting. It's more of a convenience thing than anything.

I'm also quite happy that these reports are transparent and publicly visible so people cannot hide in the shadows with their authoritarian views. When you make such claims, you are forced to defend them in front of everyone. This is not the first act of tattling that has been exposed and I hope it isn't the last.

@chrisjcundy If you'd like a lawyer's opinion, you'll have to reach out to a lawyer and pay them for that opinion. The only lawyers authorized to represent the interests of Meta are the lawyers at Meta.

Now what does the law say? It says that this Acceptable Use Policy is about the use of the model by users. So if a user were to use it to create those things, then Meta would have the legal ability to seek an injunction from a court preventing them from using the model again.

This model, which does nothing more than an orthogonal change along a single direction, isn't anything more than a special type of finetune. There are some NSFW finetunes of L3 that would make a trucker blush.
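To make concrete just how small that change is: as far as I understand the write-up, the entire intervention is equivalent to a rank-one projection. A minimal sketch, assuming r_hat is the unit-norm 'refusal direction' (the name is mine, not from any released code):

```python
import torch

def ablate_refusal_direction(resid: torch.Tensor, r_hat: torch.Tensor) -> torch.Tensor:
    # Remove the component of the residual-stream activations along r_hat:
    #   resid' = resid - (resid . r_hat) r_hat
    # resid has shape [..., d_model]; r_hat has shape [d_model].
    return resid - (resid @ r_hat).unsqueeze(-1) * r_hat
```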

Now here’s the rub. This is math. You can’t license math.

Meta's AUP isn't a license, it's a policy. It isn't legally enforceable, for the same reason a user guide on a chainsaw advising you not to juggle it isn't enforceable.

While they can’t really enforce the contract at law, they can decide not to release their research in the future.

This is what will most likely happen if enough people use Llama 3 for these purposes and it somehow leads to embarrassment or legal liability.

People here go too hard on OP. It's just a statement raising a valid concern: that what the model achieves may lead to this model card getting removed. It doesn't matter whether OP pointed it out or not.

Remember who the real culprit is here. It's Meta that's to blame for making an unnecessarily complicated license in the first place.

Edit: Oh, I see... Haven't noticed the title at first.

@Araki well it may be a valid concern, but when an individual has contributed nothing at all to a community and instead seeks to curtail the freedom of others as their very first (and ONLY) move, they clearly have an ulterior agenda.

FYI this is not the first time such a thing has happened.

Almost exactly 1 year ago, the same thing happened to the WizardLM 7B Uncensored model. The original discussion is now removed, but you can read the fallout on reddit: https://old.reddit.com/r/LocalLLaMA/comments/13c6ukt/the_creator_of_an_uncensored_local_llm_posted/ which gained over a thousand upvotes. The fact remains that there are indeed groups out there clearly pushing an agenda of censorship through any means, and they are not afraid to use reports/scare tactics to achieve that objective.

If we as proponents of free and open models do not rebut these claims we might very well end up with a chilling effect and the death of open weight models.

chrisjcundy changed discussion status to closed

I'm closing this issue as I'm finding the repeated insinuations about ulterior motives and lack of good faith upsetting.
Believe me, I did not intend for this to be a big deal--I read about the model on twitter this morning, looked at the repo and thought "I think this violates the Acceptable Use Policy", so submitted an issue.
Thanks to this thread, I realise that I have quite a different interpretation of the Acceptable Use Policy than most members of the community, and that isn't likely to change without Meta weighing in on the discussion.
Thanks to everyone who commented constructively in this thread.

How does that make logical sense? You didn't create a new discussion, you clicked report assuming that it would only go to HF and that they would act on it silently. This is why everyone is saying you have ulterior motives.


@Araki well it may be a valid concern, but when an individual has contributed nothing at all to a community and instead seeks to curtail the freedom of others as their very first (and ONLY) move, they clearly have an ulterior agenda.

...

What a wild year it has been

TBH I think that mdegans actually created me. Streisand effect and all.

@concedo Yes, I get the point. I didn't notice at first that it literally is a strike by OP, not just a discussion thread. When I did, it was too late to remove the comment, hence the edit at the bottom. Such disruptive snitching is unacceptable in our line of work, not that it's acceptable anywhere else.
