Shrirang Mahajan

NotShrirang

https://www.shrirangmahajan.in/

AI & ML interests

Deep Learning, LLMs, Machine Learning, Generative AI

Recent Activity

updated a model 10 days ago

NotShrirang/DeepSeek-R1-Distill-Qwen-1.5B-SQL-Coder-PEFT

published a model 12 days ago

NotShrirang/DeepSeek-R1-Distill-Qwen-1.5B-SQL-Coder-PEFT

liked a model about 1 month ago

deepseek-ai/DeepSeek-R1-Distill-Llama-70B

NotShrirang's activity

updated a model 10 days ago

NotShrirang/DeepSeek-R1-Distill-Qwen-1.5B-SQL-Coder-PEFT

Text Generation • Updated 10 days ago • 1

liked 3 models about 1 month ago

@ariG23498 Hey, I read the article again, and it feels a lot easier to read. Kudos for your quick response! I know that changing something you have put effort into is not easy.

Thanks for directly aligning with my preference!
Loss: 📉

reacted to burtenshaw's post with 🔥 about 2 months ago

We’re launching a FREE and CERTIFIED course on Agents!

We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.

Here's what you'll learn:

- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
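
The observation, thought, action cycle from the first bullet can be sketched roughly as follows. This is a toy illustration, not code from the course: `ToyAgent`, `rule_based_policy`, and `add_tool` are hypothetical names, and a hard-coded rule stands in for the LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyAgent:
    """Hypothetical observe -> think -> act loop; `policy` stands in for an LLM."""
    # policy maps an observation to (thought, tool name, tool argument)
    policy: Callable[[str], Tuple[str, str, str]]
    tools: Dict[str, Callable[[str], str]]
    trace: List[Tuple[str, str, str]] = field(default_factory=list)

    def run(self, task: str, max_steps: int = 5) -> str:
        observation = task  # the first observation is the task itself
        for _ in range(max_steps):
            thought, tool, arg = self.policy(observation)   # reason
            self.trace.append((observation, thought, tool))
            if tool == "finish":                            # agent decides it is done
                return arg
            observation = self.tools[tool](arg)             # act, then observe the result
        return observation


def rule_based_policy(observation: str) -> Tuple[str, str, str]:
    # Assumption: a fixed rule replaces the LLM so the sketch runs offline.
    if observation.startswith("result:"):
        answer = observation[len("result:"):].strip()
        return ("I have the result, so I can stop.", "finish", answer)
    return ("This looks like arithmetic, so I will call the add tool.", "add", observation)


def add_tool(arg: str) -> str:
    # Toy tool: parses "a + b" and reports the sum as a new observation.
    a, b = (int(part) for part in arg.split("+"))
    return f"result: {a + b}"


agent = ToyAgent(policy=rule_based_policy, tools={"add": add_tool})
answer = agent.run("2 + 3")
```

In a real agent the policy would be a model call and the tools would be things like a SQL client or a code interpreter; the loop structure is the part this sketch is meant to show.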

Audience

This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.

Enroll today and start building the next generation of AI agent applications!

https://bit.ly/hf-learn-agents

commented on Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) about 2 months ago

Great explanation! How they were able to convert an optimization problem into a differentiable equation is just amazing!
I was recently trying to understand what DPO does under the hood and I watched this video by @hkproj. Great work!

Also, just filling in for newbies like me:

- In the third step of "Reformulating the RLHF Objective", the maximization objective is divided by −β; because of the negative sign, it becomes a minimization problem.
- In "Introducing the Partition Function", Z(x) is a normalization constant. I wasn't able to understand how the term Z(x) came into the picture and how it is substituted, so I asked ChatGPT and I got this!

This makes a little bit of sense, but I have not verified whether it is correct.
There are some helpful steps in the "Mathematical Derivations" section of the DPO paper: https://arxiv.org/pdf/2305.18290
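
For anyone stuck on the same two steps, here is how I understand them, written out as a sketch. The notation (π for the policy, π_ref for the reference policy, r for the reward) is assumed from the DPO paper and may not match the article symbol for symbol.

```latex
\begin{align*}
&\max_{\pi}\; \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi(y \mid x)}
   \left[ r(x, y) - \beta \log \frac{\pi(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)} \right] \\
% Divide by -beta: the sign flip turns max into min.
=\;&\min_{\pi}\; \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi(y \mid x)}
   \left[ \log \frac{\pi(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)} - \frac{1}{\beta}\, r(x, y) \right] \\
% Fold r/beta into the log, then multiply and divide by Z(x) inside it.
=\;&\min_{\pi}\; \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi(y \mid x)}
   \left[ \log \frac{\pi(y \mid x)}
     {\frac{1}{Z(x)}\, \pi_{\mathrm{ref}}(y \mid x) \exp\!\left( \frac{1}{\beta} r(x, y) \right)}
     - \log Z(x) \right],
\end{align*}
\qquad \text{where} \qquad
Z(x) = \sum_{y} \pi_{\mathrm{ref}}(y \mid x) \exp\!\left( \frac{1}{\beta}\, r(x, y) \right).
```

As far as I can tell, Z(x) is introduced only so that the denominator inside the log sums to 1 over y and is therefore a valid probability distribution. The expectation of the log ratio is then a KL divergence between π and that distribution, and since log Z(x) does not depend on π, the minimum is reached when π equals the denominator.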