Hugging Face
Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up
5
ROHITH VENKATA REDDY
knight7561
Follow
Stefopim's profile picture
Swadhin12's profile picture
2 followers
·
17 following
AI & ML interests
Deep learning, Autonomous Driving
Recent Activity
updated
a model
7 days ago
knight7561/SmolLM2-eli5_precomputed_top_slice
updated
a model
8 days ago
knight7561/SmolLM2-FT-MyDataset
reacted
to
cfahlgren1
's
post
with ❤️
about 1 month ago
You can clean and format datasets entirely in the browser with a few lines of SQL. In this post, I replicate the process @mlabonne used to clean the new https://huggingface.co/datasets/microsoft/orca-agentinstruct-1M-v1 dataset. The cleaning process consists of: - Joining the separate splits together / add split column - Converting string messages into list of structs - Removing empty system prompts https://huggingface.co/blog/cfahlgren1/the-beginners-guide-to-cleaning-a-dataset Here's his new cleaned dataset: https://huggingface.co/datasets/mlabonne/orca-agentinstruct-1M-v1-cleaned
View all activity
Organizations
models
3
Sort: Recently updated
knight7561/SmolLM2-eli5_precomputed_top_slice
Text Generation
•
Updated
7 days ago
•
4
knight7561/SmolLM2-FT-MyDataset
Text Generation
•
Updated
8 days ago
•
7
knight7561/dummy
Fill-Mask
•
Updated
Sep 12
•
6
datasets
None public yet