===================================================================================================================
Layer (type:depth-idx)                  Output Shape            Param #
===================================================================================================================
MegaForMaskedLM                         [4, 2048, 50265]        --
├─MegaModel: 1-1                        [4, 2048, 768]          --
│    └─MegaEmbeddings: 2-1              [4, 2048, 768]          --
│    │    └─Embedding: 3-1              [4, 2048, 768]          38,603,520
│    └─ModuleList: 2-2                  --                      --
│    │    └─MegaBlock: 3-2              [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-3              [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-4              [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-5              [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-6              [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-7              [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-8              [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-9              [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-10             [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-11             [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-12             [2048, 4, 768]          6,202,626
│    │    └─MegaBlock: 3-13             [2048, 4, 768]          6,202,626
├─Linear: 1-2                           [4, 2048, 50265]        38,653,785
===================================================================================================================
Total params: 151,688,817
Trainable params: 151,688,817
Non-trainable params: 0
Total mult-adds (G): 150.35
===================================================================================================================
Input size (MB): 0.07
Forward/backward pass size (MB): 10818.75
Params size (MB): 606.71
Estimated Total Size (MB): 11425.52
===================================================================================================================
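
The parameter totals in the summary can be cross-checked with a few lines of arithmetic. The sketch below is not part of the original model card. It assumes that the vocabulary size (50265) and hidden size (768) can be read off the output shapes, that the final `Linear` layer carries a bias term, and that the twelve `MegaBlock` entries each hold the 6,202,626 parameters listed. The function names are hypothetical helpers introduced only for illustration.

```python
# Figures copied from the summary; the constant names are illustrative.
VOCAB_SIZE = 50_265          # last dimension of the MegaForMaskedLM output
HIDDEN_SIZE = 768            # last dimension of the MegaModel output
NUM_BLOCKS = 12              # MegaBlock entries 3-2 through 3-13
PARAMS_PER_BLOCK = 6_202_626


def embedding_params(vocab: int, hidden: int) -> int:
    """One weight vector of length `hidden` per vocabulary entry."""
    return vocab * hidden


def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Weight matrix plus an optional bias vector (bias is an assumption here)."""
    return in_features * out_features + (out_features if bias else 0)


def total_params() -> int:
    """Sum of the embedding, the stacked blocks, and the output projection."""
    return (
        embedding_params(VOCAB_SIZE, HIDDEN_SIZE)
        + NUM_BLOCKS * PARAMS_PER_BLOCK
        + linear_params(HIDDEN_SIZE, VOCAB_SIZE)
    )


if __name__ == "__main__":
    print(f"Embedding:    {embedding_params(VOCAB_SIZE, HIDDEN_SIZE):,}")
    print(f"Blocks:       {NUM_BLOCKS * PARAMS_PER_BLOCK:,}")
    print(f"Linear:       {linear_params(HIDDEN_SIZE, VOCAB_SIZE):,}")
    print(f"Total params: {total_params():,}")
```

Under these assumptions the three components add up to the reported total of 151,688,817, with roughly half of the parameters sitting in the embedding and output projection rather than in the `MegaBlock` stack.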