---
library_name: keras
tags:
  - switch-transformer
---

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

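| name | learning_rate | decay | beta_1 | beta_2 | epsilon | amsgrad | training_precision |
|------|---------------|-------|--------|--------|---------|---------|--------------------|
| Adam | 0.0010000000474974513 | 0.0 | 0.8999999761581421 | 0.9990000128746033 | 1e-07 | False | float32 |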
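
As a rough illustration, the optimizer settings listed above could be recreated in Keras along the following lines. This is a minimal sketch, not the original training script: the names `ADAM_CONFIG` and `build_optimizer` are hypothetical, and the import path assumes a TensorFlow installation that bundles Keras. The long decimals appear to be the float32 representations of 0.001, 0.9 and 0.999.

```python
# Hyperparameters copied from the list above (values as logged).
# ADAM_CONFIG is a hypothetical name used only for this sketch.
ADAM_CONFIG = {
    "learning_rate": 0.0010000000474974513,  # approximately 0.001
    "beta_1": 0.8999999761581421,            # approximately 0.9
    "beta_2": 0.9990000128746033,            # approximately 0.999
    "epsilon": 1e-07,
    "amsgrad": False,
}


def build_optimizer(config=ADAM_CONFIG):
    """Build a Keras Adam optimizer from `config` (hypothetical helper)."""
    # Imported lazily so the config can be inspected without Keras installed.
    # Assumption: Keras is available through TensorFlow.
    from tensorflow import keras

    # `decay` is 0.0 in the list above (no learning-rate decay), so it is
    # left out here; some newer Keras releases no longer accept that argument.
    return keras.optimizers.Adam(**config)
```

Calling `build_optimizer()` returns an optimizer that can be passed to `model.compile(optimizer=...)`. The `float32` training precision matches the Keras default, so the sketch does not set a dtype policy.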

Model Plot

[Image: model plot]