Kashif Rasul

kashif

AI & ML interests

Time Series Forecasting, Denoising Diffusion, Generative Modeling, Reinforcement Learning

Recent Activity

commented on their article about 7 hours ago

The Annotated Diffusion Model

liked a model 6 days ago

kernels-community/activation

new activity 6 days ago

google/timesfm-2.0-500m-pytorch: updated weight and config argument weights

kashif's activity

upvoted a paper 17 days ago

MONSTER: Monash Scalable Time Series Evaluation Repository

Paper • 2502.15122 • Published 21 days ago • 2

upvoted 2 articles about 1 month ago

Open R1: Update #2

Article • 7 authors • Feb 10 • 202

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Article • Jan 31 • 43

upvoted a paper 2 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 263

upvoted an article 2 months ago

Process Reinforcement through Implicit Rewards

Article • 2 authors • Jan 3 • 25

upvoted 2 papers 3 months ago

Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving

Paper • 2407.00079 • Published Jun 24, 2024 • 5

RRM: Robust Reward Model Training Mitigates Reward Hacking

Paper • 2409.13156 • Published Sep 20, 2024 • 5

upvoted a paper 5 months ago

A Rate-Distortion View of Uncertainty Quantification

Paper • 2406.10775 • Published Jun 16, 2024 • 1

upvoted 2 papers 6 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 138

Spectrum: Targeted Training on Signal to Noise Ratio

Paper • 2406.06623 • Published Jun 7, 2024 • 13

upvoted a collection 7 months ago

Power-LM

Dense & MoE LLMs trained with power learning rate scheduler. • 4 items • Updated Oct 17, 2024 • 15

upvoted 3 papers 7 months ago

Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents

Paper • 2408.07199 • Published Aug 13, 2024 • 21

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12, 2024 • 119

Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

Paper • 2405.21046 • Published May 31, 2024 • 4

upvoted 4 articles 9 months ago

Putting RL back in RLHF

Article • Jun 12, 2024 • 84

🧨 Diffusers welcomes Stable Diffusion 3

Article • Jun 12, 2024 • 94

The Annotated Diffusion Model

Article • Jun 7, 2022 • 159

Welcome Gemma 2 - Google's new open LLM

Article • Jun 27, 2024 • 129

upvoted 2 papers 9 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 93

GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks

Paper • 2406.12925 • Published Jun 14, 2024 • 24