Xi

xi0v

AI & ML interests

Reinforcement learning, diffusion model merging, LLM merging, model editing, and vision/multimodal model fine-tuning.

Recent Activity

liked a model about 19 hours ago

parsee-mizuhashi/civit-mirroring

liked a model 1 day ago

John6666/chroma-fusion-c2illustrious-sdxl

liked a Space 2 days ago

CaioXapelaum/GGUF-Playground

xi0v's activity

upvoted a paper 4 days ago

Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training

Paper • 2502.06589 • Published 7 days ago • 16

upvoted a paper 5 days ago

Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning

Paper • 2502.06060 • Published 8 days ago • 29

upvoted 3 papers 6 days ago

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 7 days ago • 125

CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference

Paper • 2502.04416 • Published 11 days ago • 10

CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance

Paper • 2502.04350 • Published 13 days ago • 10

upvoted an article 11 days ago

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Article • 14 days ago • 99

upvoted an article 13 days ago

Open-source DeepResearch – Freeing our search agents

Article • 14 days ago • 998

upvoted a collection 13 days ago

Tulu 3 Models

All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 5 days ago • 90

upvoted a paper 13 days ago

AIN: The Arabic INclusive Large Multimodal Model

Paper • 2502.00094 • Published 17 days ago • 16

upvoted an article 14 days ago

Replicating DeepSeek R1 for Information Extraction

Article • 17 days ago • 34

upvoted a collection 17 days ago

FuseO1-Preview

System-II Reasoning Fusion of LLMs • 10 items • Updated 17 days ago • 17

upvoted an article 18 days ago

Accelerating Stable Diffusion XL Inference with JAX on Cloud TPU v5e

Article • Oct 3, 2023 • 8

upvoted an article 19 days ago

Honesty, Open Source, and the Future of AI in Art: An Open Question

Article • 20 days ago • 4

upvoted 2 articles 20 days ago

Is Attention Interpretable in Transformer-Based Large Language Models? Let’s Unpack the Hype

Article • 20 days ago • 4

Open-R1: a fully open reproduction of DeepSeek-R1

Article • 21 days ago • 749

upvoted an article 21 days ago

FuseO1-Preview: System-II Reasoning Fusion of LLMs

Article • 28 days ago • 14

upvoted a paper 21 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 26 days ago • 319

upvoted a collection 22 days ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 22 days ago • 99

upvoted an article 22 days ago

Introducing smolagents: simple agents that write actions in code.

Article • Dec 31, 2024 • 691

upvoted a paper 23 days ago

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Paper • 2501.13926 • Published 25 days ago • 36