Papers I want to read. - a hazemessam Collection

Papers I want to read.

updated Aug 31, 2024

Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time

Paper • 2408.13233 • Published Aug 23, 2024 • 24

Heterogeneous Multi-task Learning with Expert Diversity

Paper • 2106.10595 • Published Jun 20, 2021 • 1

Residual Mixture of Experts

Paper • 2204.09636 • Published Apr 20, 2022 • 1

Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition

Paper • 2307.05956 • Published Jul 12, 2023 • 1

Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers

Paper • 2211.11315 • Published Nov 21, 2022 • 1