AI & ML interests

Deep Learning, Computer Vision, Machine Learning

pyimagesearch's activity

ariG23498
posted an update about 1 month ago
Tried my hand at simplifying the derivations of Direct Preference Optimization.

I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.

Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
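
A minimal sketch of the implicit-reward idea, assuming the summed log-probabilities of the chosen and rejected responses are already available from both the policy and a frozen reference model. The function name, argument names, and the default `beta` are illustrative and not taken from the blog. The reward is never modeled separately: it is read off as `beta * log(pi_theta(y|x) / pi_ref(y|x))`, and the loss is the negative log-sigmoid of the reward margin.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from sequence log-probabilities (illustrative names)."""
    # Implicit rewards: beta * log(pi_theta(y|x) / pi_ref(y|x)).
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) == softplus(-margin), written in a numerically stable form.
    loss = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return loss, chosen_reward, rejected_reward


# When the policy matches the reference, the margin is zero and the loss is log(2).
loss_at_init, _, _ = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Raising the chosen response's log-probability under the policy lowers the loss.
loss_improved, _, _ = dpo_loss(-10.0, -15.0, -12.0, -15.0)
```

In practice the log-probabilities would come from a deep learning framework and the loss would be averaged over a batch; plain floats are used here only to keep the sketch self-contained.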