August Moharrami

August4293

AI & ML interests

None yet

Organizations

August4293's activity

upvoted a paper 11 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 14 days ago • 291

updated a model 20 days ago

August4293/Llama3.1-8B-PRM-Deepseek-Data-4bit

Text Generation • Updated 20 days ago • 15

published a model 20 days ago

August4293/Llama3.1-8B-PRM-Deepseek-Data-4bit

Text Generation • Updated 20 days ago • 15

updated a model 22 days ago

August4293/tiny-llama3.1-8B-PRM-Deepseek-Data

Text Generation • Updated 22 days ago • 5

updated a collection 26 days ago

inference time compute

9 items • Updated 26 days ago

updated a dataset about 1 month ago

August4293/tldr-preference-sft-trl-style-sample

Viewer • Updated Jan 1 • 100 • 97

updated a collection about 1 month ago

RL Fine-tuning Reasoning

A Collection of Papers on Using Reinforcement Learning to Enhance Reasoning • 11 items • Updated Dec 26, 2024

liked a model about 1 month ago

trl-internal-testing/tiny-LlamaForCausalLM-3.2

Text Generation • Updated Nov 25, 2024 • 178k • 1

updated 3 collections about 1 month ago

RL Fine-tuning Reasoning

A Collection of Papers on Using Reinforcement Learning to Enhance Reasoning • 11 items • Updated Dec 26, 2024

RL Fine-tuning Tool Usage

Collection of papers that utilize reinforcement learning to enhance tool usage and function calling. • 3 items • Updated Dec 24, 2024

inference time compute

9 items • Updated 26 days ago