---
license: apache-2.0
language:
- en
---
|
|
|
OpenLLaMA-13B for reward modeling
|
|
|
- Dataset: https://huggingface.co/datasets/pvduy/rm_oa_hh
- Logs: https://wandb.ai/sorry/autocrit/runs/j05t4e97?workspace=user-sorry
- Code: https://github.com/CarperAI/autocrit/blob/main/train_reward_model.py
|
|
|
Usage: |
|
|
|
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

ckpt = "reciprocate/openllama-13b_rm_oasst-hh"
# 4-bit loading requires the bitsandbytes package to be installed
model = AutoModelForSequenceClassification.from_pretrained(ckpt, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(ckpt)

# score a single response; the model returns one scalar logit per input,
# where higher values indicate a more preferred response
model(**tokenizer("ASSISTANT: This sentence is a lie.", return_tensors="pt"))[0].item()
```
|
|
|
Output:

```python
-1.626953125
```
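A single scalar score like the one above is hard to interpret on its own; reward models trained on preference data are typically used to compare two candidate responses to the same prompt. A minimal sketch of the usual Bradley-Terry reading of score differences (the helper function below is illustrative and not part of this repository):

```python
import math

def preference_probability(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that response A is preferred over B,
    given the reward model's scalar score for each response."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

# equal scores imply no preference either way
print(preference_probability(-1.63, -1.63))  # 0.5
```

Under this reading, only the difference between two scores matters, so the absolute magnitude of a single output (e.g. `-1.626953125`) carries no meaning by itself.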
|
|