reciprocate/mistral-7b-rm

Text Classification



reciprocate committed on Nov 9, 2023

Commit e301d78 (1 parent: e476087)

Create README.md

Files changed (1)

README.md +34 -0

README.md ADDED

	@@ -0,0 +1,34 @@

+---
+language:
+- en
+---
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
+model_path = "reciprocate/mistral-7b-rm"
+model = AutoModelForSequenceClassification.from_pretrained(model_path)
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+reward_fn = pipeline("text-classification", model=model, tokenizer=tokenizer, truncation=True, batch_size=8, max_length=4096, device=0)
+chats = [[
+    {"role": "user", "content": "When was the battle at Waterloo?"},
+    {"role": "assistant", "content": "I think it was in 1983, but please double-check that when you have a chance."}
+], [
+    {"role": "user", "content": "When was the battle at Waterloo?"},
+    {"role": "assistant", "content": "The battle at Waterloo took place on June 18, 1815."}
+]]
+output = reward_fn([tokenizer.apply_chat_template(chat, tokenize=False) for chat in chats])
+scores = [x["score"] for x in output]
+scores
+```
+```
+>>> [0.2586347758769989, 0.6663259267807007]
+```
+```python
+import numpy as np
+# optionally normalize with the mean and std computed on the training data
+scores = (np.array(scores) - 2.01098) / 1.69077
+```
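The optional normalization step in the README can be sketched as a small standalone helper. This is a minimal sketch, not part of the commit: the function name `normalize_scores` and the constant names are hypothetical. The mean and standard deviation are the values quoted in the README, and the raw scores are the two values from its sample output.

```python
import numpy as np

# Mean and std quoted in the README (computed on the training data).
TRAIN_MEAN = 2.01098
TRAIN_STD = 1.69077


def normalize_scores(scores, mean=TRAIN_MEAN, std=TRAIN_STD):
    """Hypothetical helper: shift and scale raw reward scores (z-score)."""
    return (np.asarray(scores, dtype=float) - mean) / std


# Raw scores copied from the README's sample output.
raw = [0.2586347758769989, 0.6663259267807007]
normalized = normalize_scores(raw)

# The transform is monotonic, so the ranking of responses is unchanged.
best = int(np.argmax(normalized))
print(normalized, best)
```

Because the transform only shifts and rescales, it changes the scale of the scores but not which response ranks higher; here the second chat (the 1815 answer) stays on top.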