Charles I Niswander II

charlesniswander

dhar174

AI & ML interests

None yet

Organizations

None yet

charlesniswander's activity

upvoted a paper 2 days ago

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published 3 days ago • 58

upvoted a paper 6 days ago

Rank1: Test-Time Compute for Reranking in Information Retrieval

Paper • 2502.18418 • Published 9 days ago • 25

upvoted 2 papers 8 days ago

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 40

ReMamba: Equip Mamba with Effective Long-Sequence Modeling

Paper • 2408.15496 • Published Aug 28, 2024 • 12

upvoted a collection 8 days ago

Foundation AI Papers

Collection

Curated List of Must-Reads on LLM reasoning at Temus AI team • 135 items • Updated Jun 15, 2024 • 31

upvoted 15 papers 8 days ago

Farewell to Length Extrapolation, a Training-Free Infinite Context with Finite Attention Scope

Paper • 2407.15176 • Published Jul 21, 2024 • 1

LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models

Paper • 2409.00509 • Published Aug 31, 2024 • 39

Neurocache: Efficient Vector Retrieval for Long-range Language Modeling

Paper • 2407.02486 • Published Jul 2, 2024 • 1

LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Paper • 2401.01325 • Published Jan 2, 2024 • 27

Engineering A Large Language Model From Scratch

Paper • 2401.16736 • Published Jan 30, 2024 • 2

Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey

Paper • 2311.12351 • Published Nov 21, 2023 • 4

Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4, 2024 • 64

Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31, 2024 • 17

DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models

Paper • 2403.00818 • Published Feb 26, 2024 • 19

A Quantitative Review on Language Model Efficiency Research

Paper • 2306.01768 • Published May 28, 2023 • 2

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 32