jondurbin committed on
Commit
a78a283
1 Parent(s): 72f2c0e

Update README.md

  1. README.md +28 -0
README.md CHANGED
@@ -1,3 +1,31 @@
  ---
  license: other
+ datasets:
+ - jondurbin/airoboros-gpt4-1.4.1
  ---
+
+ ### Overview
+
+ Llama 2 version of https://huggingface.co/jondurbin/airoboros-70b-gpt4-1.4.1-qlora
+
+ See that model card for all the details.
+
+
+ ### Licence and usage restrictions
+
+ This model was built on llama-2, which has a proprietary/custom Meta license.
+ - See the LICENSE.txt file attached for the original license, along with USE_POLICY.md which was also provided by Meta.
+
+ The data used to fine-tune the llama-2-70b-hf model was generated by GPT4 via OpenAI API calls using [airoboros](https://github.com/jondurbin/airoboros)
+ - The ToS for OpenAI API usage has a clause preventing the output from being used to train a model that __competes__ with OpenAI
+   - what does *compete* actually mean here?
+   - these small open source models will not produce output anywhere near the quality of gpt-4, or even gpt-3.5, so I can't imagine this could credibly be considered competing in the first place
+   - if someone else uses the dataset to do the same, they wouldn't necessarily be violating the ToS because they didn't call the API, so I don't know how that works
+ - the training data used in essentially all large language models includes a significant amount of copyrighted or otherwise restrictively licensed material in the first place
+ - other work using the self-instruct method, e.g. the original here: https://github.com/yizhongw/self-instruct released the data and model as apache-2
+
+ I am purposely leaving this license ambiguous (other than the fact that you must comply with the original Meta license) because I am not a lawyer and refuse to attempt to interpret all of the terms.
+
+ Your best bet is probably to avoid using this commercially due to the OpenAI API usage.
+
+ Either way, by using this model, you agree to completely indemnify me from any and all license-related issues.