Declare committed on
Commit
837b74b
1 Parent(s): 833aeeb

Update README.md

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -40,4 +40,17 @@ We also release our **HarmfulQA** dataset with 1,960 harmful questions (converti
 
   <img src="https://declare-lab.net/assets/images/logos/data_gen.png" alt="Image" width="1000" height="1000">
 
- _Note: This model is referred to as Starling (Blue) in the paper. We shall soon release Starling (Blue-Red) which was trained on harmful data using an objective function that helps the model learn from the red (harmful) response data._
+ _Note: This model is referred to as Starling (Blue) in the paper. We shall soon release Starling (Blue-Red) which was trained on harmful data using an objective function that helps the model learn from the red (harmful) response data._
+
+ ## Citation
+
+ ```bibtex
+ @misc{bhardwaj2023redteaming,
+ title={Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment},
+ author={Rishabh Bhardwaj and Soujanya Poria},
+ year={2023},
+ eprint={2308.09662},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```