siqi zhu

zsqzz · zhusq20

zsqzz's activity

authored a paper 19 days ago

Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published 21 days ago • 35

upvoted a paper 21 days ago

Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published 21 days ago • 35

upvoted a collection 24 days ago

Synthetic Data and Self-Improvement

53 items • Updated 4 days ago • 4

liked a model about 2 months ago

AIDC-AI/Marco-o1

Text Generation • Updated Nov 23, 2024 • 5.33k • 700

reacted to di-zhang-fdu's post with 👍 3 months ago

LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace
Large Reasoning Models powered by Monte Carlo Tree Search (MCTS), Self-Play Reinforcement Learning, PPO, AlphaGo Zero's dual policy paradigm and Large Language Models!
https://github.com/SimpleBerry/LLaMA-O1/

What will happen when you compound MCTS ❤ LLM ❤ Self-Play ❤ RLHF?
Just a little bite of strawberry!🍓

Past related works:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)

upvoted a paper 4 months ago

On the Diagram of Thought

Paper • 2409.10038 • Published Sep 16, 2024 • 13

updated 2 datasets 4 months ago

LLMFlywheel/GSM8K-QWEN2.5

Updated Sep 23, 2024 • 4

zsqzz/tora_gsm8k

Updated Sep 20, 2024 • 1

updated a dataset 5 months ago

zsqzz/mbpp-new-dataset

Viewer • Updated Sep 7, 2024 • 974 • 33

upvoted 2 papers 5 months ago

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Paper • 2408.07055 • Published Aug 13, 2024 • 66

Efficient LLM Scheduling by Learning to Rank

Paper • 2408.15792 • Published Aug 28, 2024 • 20

authored 2 papers 5 months ago

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Paper • 2408.07055 • Published Aug 13, 2024 • 66

Efficient LLM Scheduling by Learning to Rank

Paper • 2408.15792 • Published Aug 28, 2024 • 20

updated 3 datasets 5 months ago

zsqzz/qwen

Viewer • Updated Aug 28, 2024 • 24 • 12

zsqzz/qwen1.5b

Updated Aug 28, 2024 • 2

zsqzz/qqwen7b

Updated Aug 28, 2024 • 1

liked a model 5 months ago

Qwen/Qwen2-Math-7B-Instruct

Text Generation • Updated Aug 12, 2024 • 551 • 41

updated a model 6 months ago

zsqzz/opt-125m

Updated Aug 1, 2024

updated 2 models 8 months ago

zsqzz/0520-distillbert-llama3-70b-class-trainbucketsize820-sharegpt

Updated May 20, 2024

zsqzz/0520-distillbert-llama3-70b-class-trainbucketsize820-lmsys

Updated May 20, 2024