I've been working on a Space that makes it super easy to create notebooks and helps users quickly understand and manipulate their data! With just a few clicks, you can automatically generate notebooks for:
📊 Exploratory Data Analysis
🧠 Text Embeddings
🤖 Retrieval-Augmented Generation (RAG)
✨ Automatic training is coming soon! Check it out here: asoria/auto-notebook-creator
I'd appreciate any feedback to improve this tool 🤗
🚀 We will be generating a preference dataset for DPO/ORPO and cleaning it with AI feedback during our upcoming meetup!
In this session, we'll walk you through the essentials of building a distilabel pipeline by exploring two key use cases: cleaning an existing dataset and generating a preference dataset for DPO/ORPO. You’ll also learn how to make the most of AI feedback and how to integrate Argilla to gather human feedback and improve overall data quality.
This session is perfect for you:
- if you’re getting started with distilabel or synthetic data
- if you want to learn how to use LLM inference endpoints for **free**
- if you want to discover new functionalities
- if you want to provide us with new feedback
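For anyone new to preference data, here is a rough idea of what a DPO/ORPO-style record can look like. This is a minimal plain-Python sketch, not distilabel's API: the function name `to_preference_pair`, the rating scale, and the sample responses are all made up for illustration. It assumes each response already has a numeric rating (for example from AI feedback) and keeps the best and worst as the `chosen` and `rejected` pair.

```python
from typing import Dict, List, Optional, Tuple


def to_preference_pair(
    prompt: str, rated: List[Tuple[str, float]]
) -> Optional[Dict[str, str]]:
    """Build one prompt/chosen/rejected record from rated responses.

    Hypothetical helper: `rated` is a list of (response, rating) tuples.
    Returns None when there is no usable preference signal.
    """
    if len(rated) < 2:
        return None  # need at least two responses to compare
    best = max(rated, key=lambda r: r[1])
    worst = min(rated, key=lambda r: r[1])
    if best[1] == worst[1]:
        return None  # all ratings tied, so nothing to prefer
    return {"prompt": prompt, "chosen": best[0], "rejected": worst[0]}


# Illustrative data only; the ratings are invented for this example.
record = to_preference_pair(
    "What does RAG stand for?",
    [
        ("Retrieval-Augmented Generation.", 4.5),
        ("I am not sure.", 1.0),
        ("Something about retrieval.", 2.5),
    ],
)
print(record)
```

Dropping tied prompts is one possible design choice; a real pipeline might keep them for a later human review step instead.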
Is your summer reading list still empty? Curious if an LLM can generate a book blurb you'd enjoy and help build a KTO preference dataset at the same time?