tsessk's Collections

llm-course-hw2

llm course @ HSE and vk llm

A collection of SmolLM-135M models fine-tuned with DPO, PPO, and Reward Modeling to enhance human-like expressiveness.