Daniel Wang

DanielWang

benywon

AI & ML interests

Natural Language Processing, Machine Learning

DanielWang's activity

authored 6 papers 13 days ago

Base of RoPE Bounds Context Length

Paper • 2405.14591 • Published May 23

Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs

Paper • 2406.09367 • Published Jun 13

BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline

Paper • 2408.15079 • Published Aug 27 • 52

commented on a paper 15 days ago

KV Shifting Attention Enhances Language Modeling

Paper • 2411.19574 • Published 24 days ago • 8

upvoted a paper 15 days ago

KV Shifting Attention Enhances Language Modeling

Paper • 2411.19574 • Published 24 days ago • 8

upvoted 3 papers 9 months ago

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Paper • 2403.12015 • Published Mar 18 • 64

Language models scale reliably with over-training and on downstream tasks

Paper • 2403.08540 • Published Mar 13 • 14

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Paper • 2403.08763 • Published Mar 13 • 49

upvoted 2 papers 10 months ago

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Paper • 2403.07816 • Published Mar 12 • 39

Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11 • 90

authored 3 papers 10 months ago

T2Ranking: A large-scale Chinese Benchmark for Passage Ranking

Paper • 2304.03679 • Published Apr 7, 2023

Tasty Burgers, Soggy Fries: Probing Aspect Robustness in Aspect-Based Sentiment Analysis

Paper • 2009.07964 • Published Sep 16, 2020

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6 • 61

updated a collection 10 months ago

LLM

Collection

1 item • Updated Mar 7

upvoted a paper 10 months ago

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6 • 61

liked a model over 1 year ago

baichuan-inc/Baichuan2-13B-Chat

Text Generation • Updated Feb 26 • 77.6k • 425