Why Not Utilize a Sigmoid Function in the Regression Layer?

#8
by xwz-xmu - opened

The regression layer typically performs a linear transformation, which does not inherently constrain the range of the predicted rewards. Consequently, the model may output extremely large positive values or even negative ones, whereas the ground-truth rewards in the dataset are normalized to the range [0, 1].

I am curious whether omitting a sigmoid function in the regression layer could negatively impact the performance of multi-objective reward modeling.


You can try a sigmoid. I tried a logistic regression loss (with a sigmoid) before and did not find it to outperform plain regression. Regression on the Llama backbone is quite stable.
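To make the two options concrete, here is a minimal sketch (not the model's actual code) of the head outputs and losses being compared: a plain regression head trained with MSE on its raw, unbounded score, versus a sigmoid-squashed head trained with a binary-cross-entropy (logistic) loss against a reward label in [0, 1]. The numbers are illustrative only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw score from a linear regression head (unbounded).
raw_score = 3.2

# Squashing with a sigmoid constrains the prediction to (0, 1),
# matching reward labels normalized to [0, 1].
bounded_score = sigmoid(raw_score)

target = 0.9  # ground-truth reward in [0, 1]

# Option 1 -- plain regression: MSE on the raw (unbounded) output.
mse_loss = (raw_score - target) ** 2

# Option 2 -- logistic-regression-style training: binary cross-entropy
# on the sigmoid-squashed output.
bce_loss = -(target * math.log(bounded_score)
             + (1 - target) * math.log(1 - bounded_score))
```

The sigmoid guarantees predictions stay in (0, 1), but as noted above, in practice the unbounded regression head can train stably and match the logistic variant.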

Haoxiang-Wang changed discussion status to closed
