
Junjie Chen

coderchen01

https://junjie-chen.info

AI & ML interests

Efficient AI, Multimodal AI, Generative AI

Recent Activity

upvoted a paper about 2 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

liked a dataset about 2 months ago

microsoft/SCBench

liked a Space about 2 months ago

HuggingFaceH4/blogpost-scaling-test-time-compute


Organizations

None yet

coderchen01's activity

upvoted a paper about 2 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 131

upvoted a paper 2 months ago

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 126

upvoted a paper 3 months ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 51

upvoted an article 3 months ago

Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI.

Article • May 21, 2024 • 35

upvoted 3 papers 3 months ago

SlimLM: An Efficient Small Language Model for On-Device Document Assistance

Paper • 2411.09944 • Published Nov 15, 2024 • 12

Top-nσ: Not All Logits Are You Need

Paper • 2411.07641 • Published Nov 12, 2024 • 20

"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Paper • 2411.02355 • Published Nov 4, 2024 • 48

upvoted an article 4 months ago

🕳️ Attention Sinks in LLMs for endless fluency

Article • Oct 9, 2023 • 7

upvoted a paper 4 months ago

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 145

upvoted 2 articles 4 months ago

Scaling AI-based Data Processing with Hugging Face + Dask

Article • Oct 9, 2024 • 28

How 🤗 Accelerate runs very large models thanks to PyTorch

Article • Sep 27, 2022 • 10

upvoted 2 papers 4 months ago

MLP-KAN: Unifying Deep Representation and Function Learning

Paper • 2410.03027 • Published Oct 3, 2024 • 29

LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks

Paper • 2410.01744 • Published Oct 2, 2024 • 26

upvoted 2 papers 5 months ago

YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models

Paper • 2409.13592 • Published Sep 20, 2024 • 49

LLaMA-Omni: Seamless Speech Interaction with Large Language Models

Paper • 2409.06666 • Published Sep 10, 2024 • 56

upvoted a paper 6 months ago

POA: Pre-training Once for Models of All Sizes

Paper • 2408.01031 • Published Aug 2, 2024 • 27

upvoted 2 papers 7 months ago

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31, 2024 • 76

OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects

Paper • 2407.08711 • Published Jul 11, 2024 • 6

upvoted a collection 7 months ago

Model Merging

Collection • 30 items • Updated Jun 12, 2024 • 227

Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it!

upvoted a paper 7 months ago

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4, 2024 • 36