- Nearest Neighbor Speculative Decoding for LLM Generation and Attribution (arXiv:2405.19325, published May 29, 2024)
- LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding (arXiv:2404.16710, published Apr 25, 2024)
- TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding (arXiv:2404.11912, published Apr 18, 2024)
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length (arXiv:2404.08801, published Apr 12, 2024)
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection (arXiv:2403.03507, published Mar 6, 2024)
- Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time (arXiv:2310.17157, published Oct 26, 2023)
- FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU (arXiv:2303.06865, published Mar 13, 2023)