Haoxiang Wang

Haoxiang-Wang

https://haoxiang-wang.github.io/

AI & ML interests

Machine Learning (Transfer Learning, OOD Generalization, Domain Adaptation, Meta-Learning)

Recent Activity

updated a model about 18 hours ago

nvidia/Cosmos-1.0-Guardrail

updated a model 1 day ago

nvidia/Cosmos-1.0-Tokenizer-DV8x16x16

updated a model 1 day ago

nvidia/Cosmos-1.0-Tokenizer-CV8x8x8

Haoxiang-Wang's activity

New activity in nvidia/Cosmos-0.1-Tokenizer-DV8x16x16 2 months ago

Update README.md

#1 opened 2 months ago by

Haoxiang-Wang

New activity in sfairXC/FsfairX-LLaMA3-RM-v0.1 3 months ago

Update README.md

#6 opened 3 months ago by

Haoxiang-Wang

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 4 months ago

Why is the code-complexity coefficient so high in the demo example?

#16 opened 4 months ago by

icdt

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 5 months ago

Special tokens in the vocabulary?

#13 opened 6 months ago by

nshen7

Original reward space

#15 opened 5 months ago by

anjaa

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 6 months ago

[AUTOMATED] Model Memory Requirements

#5 opened 7 months ago by

model-sizer-bot

What is the range of the output score from the model?

#12 opened 6 months ago by

nshen7

Why is `multi_obj_rewards` multiplied by 5, but then 0.5 is subtracted from it?

#11 opened 6 months ago by

xzuyn

Update README.md

#3 opened 7 months ago by

philschmid

Issue when finetuning the reward model on custom dataset

#2 opened 7 months ago by

yguooo

Longer context

#10 opened 6 months ago by

salazaaar

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 7 months ago

batched predictions with padding through the model don't seem to work correctly

#7 opened 7 months ago by

karthikramen

ModuleNotFoundError: No module named 'transformers_modules.RLHFlow.ArmoRM-Llama3-8B-v0'

#6 opened 7 months ago by

fchaubard

Why Not Utilize a Sigmoid Function in the Regression Layer?

#8 opened 7 months ago by

xwz-xmu

New activity in allenai/reward-bench 7 months ago

Separate Scores: With & Without Prior Sets

#6 opened 7 months ago by

Haoxiang-Wang

New activity in RLHFlow/ArmoRM-Llama3-8B-v0.1 8 months ago

Problem running the model

#1 opened 8 months ago by

Asaf-Yehudai

New activity in RLHFlow/LLaMA3-iterative-DPO-final 8 months ago

exl2 quants

#2 opened 8 months ago by

Apel-sin

New activity in RLHFlow/pair-preference-model-LLaMA3-8B 8 months ago

Can you specify the license for this model, please?

#1 opened 8 months ago by

sparsh35

commented a paper 8 months ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024

New activity in prometheus-eval/Feedback-Bench 9 months ago

Data Description

#2 opened 9 months ago by

Haoxiang-Wang