Manuel Romero PRO

mrm8488

https://mrm8488.github.io

AI & ML interests

#AI Research and Democratization. NLP/NLG 🤗

Recent Activity

liked a model 4 days ago

sesame/csm-1b

mrm8488's activity

upvoted a paper 1 day ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 11 days ago • 94

upvoted a collection 4 days ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 14 items • Updated 6 days ago • 107

upvoted a collection 6 days ago

👩‍💻 OlympicCoder

Reasoning datasets and models for competitive coding • 4 items • Updated 6 days ago • 10

upvoted a collection 11 days ago

olmOCR

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 3 items • Updated 4 days ago • 96

upvoted an article 20 days ago

FastRTC: The Real-Time Communication Library for Python

Article • 21 days ago • 145

upvoted a paper 29 days ago

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10 • 60

upvoted 2 papers about 1 month ago

NoLiMa: Long-Context Evaluation Beyond Literal Matching

Paper • 2502.05167 • Published Feb 7 • 15

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 205

upvoted a collection about 2 months ago

WildChat-50m

All model responses associated with the WildChat-50m paper. • 55 items • Updated Jan 29 • 7

upvoted an article about 2 months ago

Welcome to Inference Providers on the Hub 🔥

Article • Jan 28 • 437

upvoted a collection 2 months ago

Financial Sentiment Analysis 💲📈

Financial Sentiment Analysis models I created • 3 items • Updated Jan 16 • 4

upvoted 3 papers 2 months ago

Agentless: Demystifying LLM-based Software Engineering Agents

Paper • 2407.01489 • Published Jul 1, 2024 • 61

Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP

Paper • 2408.04303 • Published Aug 8, 2024 • 20

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 50

upvoted a paper 3 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 135

upvoted a collection 3 months ago

📚 FineWeb-Edu

FineWeb-Edu datasets, classifier and ablation model • 5 items • Updated Jun 12, 2024 • 13

upvoted a paper 3 months ago

GEITje 7B Ultra: A Conversational Model for Dutch

Paper • 2412.04092 • Published Dec 5, 2024 • 3