- Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time
  Paper • 2408.13233 • Published • 21
- Heterogeneous Multi-task Learning with Expert Diversity
  Paper • 2106.10595 • Published • 1
- Residual Mixture of Experts
  Paper • 2204.09636 • Published • 1
- Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
  Paper • 2307.05956 • Published • 1
Hazem Essam
hazemessam
AI & ML interests
Protein Language Modeling, Natural Language Processing, Generative Adversarial Networks.
Recent Activity
authored a paper 7 days ago
liked a model 14 days ago: Etched/oasis-500m
liked a dataset about 1 month ago: bloyal/antiberta-pretrain
Organizations
Collections
1
datasets
7
hazemessam/ddg_megadataset • Viewer • 754k • 41
hazemessam/ddg • Preview • 36
hazemessam/abyssal_db • Preview • 31
hazemessam/prostata • Viewer • 10.5k • 32
hazemessam/fireprot_db • Viewer • 53.4k • 33
hazemessam/uniprot_sprot • Viewer • 569k • 43
hazemessam/squad_v2 • Viewer • 2 • 59 • 1