Alex Chen PRO

alexchen4ai

https://alexchen4ai.github.io/blog/

AI & ML interests

NLP

Recent Activity

liked a model 5 days ago

NexaAIDev/DeepSeek-R1-Distill-Llama-8B-NexaQuant

liked a model 5 days ago

NexaAIDev/DeepSeek-R1-Distill-Qwen-1.5B-NexaQuant

liked a model 26 days ago

deepseek-ai/Janus-Pro-7B

alexchen4ai's activity

upvoted a paper 2 months ago

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 41

upvoted a collection 5 months ago

Moshi v0.1 Release

Collection

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227

upvoted an article 5 months ago

Article

Introduction to ggml

Aug 13, 2024 • 155

upvoted a collection 5 months ago

Molmo

Collection

Artifacts for open multimodal language models. • 5 items • Updated 12 days ago • 296

upvoted a paper 6 months ago

Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28, 2024 • 43

upvoted an article 6 months ago

Article

StackLLaMA: A hands-on guide to train LLaMA with RLHF

Apr 5, 2023 • 27

upvoted 2 papers 8 months ago

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3, 2024 • 93

Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26, 2024 • 48

upvoted a paper 10 months ago

Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117

upvoted a paper 11 months ago

Octopus v2: On-device language model for super agent

Paper • 2404.01744 • Published Apr 2, 2024 • 57