- Power-LM (collection): Dense and MoE LLMs trained with the power learning rate scheduler. 4 items.
- Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler. Paper, arXiv:2408.13359, published Aug 2024.
- Octo-planner: On-device Language Model for Planner-Action Agents. Paper, arXiv:2406.18082, published Jun 2024.
- Efficient Continual Pre-training by Mitigating the Stability Gap. Paper, arXiv:2406.14833, published Jun 2024.
- Granite Code Models (collection): A series of code models trained by IBM, licensed under Apache 2.0; both the base pretrained and instruct models are released. 23 items.
- Aligning Large Multimodal Models with Factually Augmented RLHF. Paper, arXiv:2309.14525, published Sep 2023.
- The Consensus Game: Language Model Generation via Equilibrium Search. Paper, arXiv:2310.09139, published Oct 2023.