reciprocate/openllama-13b_rm_oasst-hh

Text Classification

text-generation-inference

Inference Endpoints

reciprocate committed on Jun 21, 2023

Commit 41f353d • 1 Parent(s): a91f060

Create README.md

Files changed (1)

README.md +28 -0

README.md ADDED

	@@ -0,0 +1,28 @@

+---
+license: apache-2.0
+language:
+- en
+---
+OpenLLaMA-13B for reward modeling
+- Dataset: https://huggingface.co/datasets/pvduy/rm_oa_hh
+- Logs: https://wandb.ai/sorry/autocrit/runs/j05t4e97?workspace=user-sorry
+- Code: https://github.com/CarperAI/autocrit/blob/main/train_reward_model.py
+Usage:
+```python
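+import torch
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+ckpt = "reciprocate/openllama-13b_rm_oasst-hh"
+# 4-bit loading needs the bitsandbytes package and a CUDA GPU
+model = AutoModelForSequenceClassification.from_pretrained(ckpt, load_in_4bit=True)
+tokenizer = AutoTokenizer.from_pretrained(ckpt)
+inputs = tokenizer("ASSISTANT: This sentence is a lie.", return_tensors="pt").to(model.device)
+with torch.no_grad():  # inference only, no gradients needed
+    print(model(**inputs)[0].item())  # scalar reward score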
+```
+Output:
+```text
+-1.626953125
+```
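
A single reward score such as the one above is mostly useful relative to another score. The sketch below is not part of the commit. It assumes the model was trained with a pairwise logistic (Bradley-Terry) objective, which the README does not state, and shows how two scalar rewards could be turned into a preference probability under that assumption. The function name `preference_probability` and the second score are hypothetical, chosen only for illustration.

```python
import math


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Probability that response A is preferred over response B.

    Assumes a Bradley-Terry model: P(A > B) = sigmoid(reward_a - reward_b).
    """
    diff = reward_a - reward_b
    # Numerically stable sigmoid: avoid exp() overflow for large |diff|
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


if __name__ == "__main__":
    score_from_readme = -1.626953125  # output shown in the README
    hypothetical_other = 0.5          # made-up score for a second response
    p = preference_probability(hypothetical_other, score_from_readme)
    print(f"P(other preferred over README example) = {p:.3f}")
```

Only the difference between the two rewards matters here, so adding the same constant to both scores leaves the probability unchanged.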