llm-course-hw2 - a tsessk Collection

updated 7 days ago

llm course @ HSE and vk llm

A collection of SmolLM-135M models fine-tuned with DPO, PPO, and Reward Modeling to enhance human-like expressiveness.