theblackcat102/electra-large-reward-model

	---
	language:
	- en
	tags:
	- webgpt
	- regression
	- reward-model
	license: apache-2.0
	datasets:
	- openai/webgpt_comparisons
	- openai/summarize_from_feedback
	metrics:
	- accuracy
	---

	Reward model trained on openai/webgpt_comparisons and openai/summarize_from_feedback. Unlike the other electra-large model, this model is trained with a rank loss and on one additional dataset.
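The card does not spell out the exact form of the rank loss. As a minimal sketch, assuming the common pairwise formulation `-log(sigmoid(s_preferred - s_rejected))`, the loss and the pairwise accuracy listed under `metrics` could be computed as below. The function names `pairwise_rank_loss` and `pairwise_accuracy` are hypothetical and are not taken from the author's training code.

```python
import math


def pairwise_rank_loss(score_preferred: float, score_rejected: float) -> float:
    """Assumed pairwise rank loss: -log(sigmoid(score_preferred - score_rejected)).

    Written as softplus(-margin) and split by sign so exp() never overflows.
    """
    margin = score_preferred - score_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pairwise_accuracy(pairs) -> float:
    """Fraction of (preferred, rejected) score pairs ranked in the right order."""
    correct = sum(1 for preferred, rejected in pairs if preferred > rejected)
    return correct / len(pairs)


# Illustrative scores only, not outputs of the actual model.
example_pairs = [(1.3, -0.2), (0.1, 0.4), (2.0, 1.9)]
losses = [pairwise_rank_loss(p, r) for p, r in example_pairs]
print(sum(losses) / len(losses), pairwise_accuracy(example_pairs))
```

The loss depends only on the difference between the two scores, so the model learns a relative ordering of answers rather than an absolute score.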

	On the validation dataset, the results are much more stable than usual.

	Refer to this [wandb run](https://wandb.ai/theblackcat102/reward-model/runs/1d4e4oi2?workspace=) for more details.


	This model performs slightly better than the previous WebGPT-only model, [electra-large](https://huggingface.co/theblackcat102/electra-large-webgpt-rm).
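The card gives no usage snippet. The sketch below shows one way the model might be queried with the `transformers` library, assuming it loads as a standard sequence-classification checkpoint and takes the question and the answer as a text pair. Both points are assumptions, and the helper name `score_answer` is hypothetical.

```python
def score_answer(
    question: str,
    answer: str,
    model_name: str = "theblackcat102/electra-large-reward-model",
) -> float:
    """Return the first logit of the reward model for a (question, answer) pair.

    Assumptions: the checkpoint loads with AutoModelForSequenceClassification
    and the first logit is the reward score.
    """
    # Imported lazily so defining the function does not require the packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Encode the question and the answer as a sentence pair.
    inputs = tokenizer(question, answer, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits[0, 0].item()
```

Calling `score_answer` for two candidate answers to the same question and comparing the returned values gives a preference between them. The function reloads the model on every call, so load it once outside the function when scoring many pairs.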