FasterDecoding

community

https://github.com/FasterDecoding

AI & ML interests

Making model inference more efficient through model-system co-design.

FasterDecoding's activity

tianlecai

authored 2 papers 8 months ago

SnapKV: LLM Knows What You are Looking for Before Generation

Paper • 2404.14469 • Published Apr 22 • 23

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

Paper • 2404.07413 • Published Apr 11 • 36

Gsunshine

authored a paper 10 months ago

One-Step Diffusion Distillation via Deep Equilibrium Models

Paper • 2401.08639 • Published Dec 12, 2023

tianlecai

authored 2 papers 10 months ago

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Paper • 2402.19481 • Published Feb 29 • 20

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 19

jamesliu1

authored a paper 10 months ago

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 19

tianlecai

updated a model 10 months ago

FasterDecoding/BitDelta_Mistral_combo

Updated Feb 14

tianlecai

updated a model 11 months ago

FasterDecoding/medusa-1.0-vicuna-13b-v1.5

Text Generation • Updated Jan 25 • 18 • 1

Gsunshine

authored a paper 11 months ago

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 54

yli3521

authored a paper 11 months ago

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 54

tianlecai

authored a paper 11 months ago

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 54

tianlecai

updated 3 models about 1 year ago

tianlecai

updated 3 models over 1 year ago

FasterDecoding/medusa-vicuna-33b-v1.3

Updated Sep 11, 2023 • 50 • 4

FasterDecoding/medusa-vicuna-13b-v1.3

Updated Sep 11, 2023 • 150 • 5

FasterDecoding/medusa-vicuna-7b-v1.3

Updated Sep 11, 2023 • 8.24k • 16

tianlecai

updated a Space over 1 year ago

🌖 README • Running

tianlecai

authored a paper over 1 year ago

Large Language Models as Tool Makers

Paper • 2305.17126 • Published May 26, 2023 • 3

Team members 4
