metadata
library_name: keras
tags:
- switch-transformer
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
| name | learning_rate | decay | beta_1 | beta_2 | epsilon | amsgrad | training_precision |
|---|---|---|---|---|---|---|---|
| Adam | 0.0010000000474974513 | 0.0 | 0.8999999761581421 | 0.9990000128746033 | 1e-07 | False | float32 |
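
The long decimals in the table appear to be float32 roundings of the usual Adam defaults (0.001, 0.9, 0.999). The sketch below checks that reading and shows one way the optimizer could be rebuilt from the table. It is a minimal illustration, not the original training script: `ADAM_CONFIG`, `to_float32`, and `build_optimizer` are hypothetical names introduced here, and the nominal values are an assumption inferred from the table.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Nominal values inferred from the table (assumption: the long decimals are
# float32 roundings of 0.001, 0.9 and 0.999).
ADAM_CONFIG = {
    "learning_rate": 0.001,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
}


def build_optimizer():
    """Build an Adam optimizer from ADAM_CONFIG (requires Keras to be installed)."""
    import keras  # imported lazily so the config above is usable without Keras

    # decay is 0.0 in the table, i.e. no learning-rate decay, so it is omitted;
    # newer Keras versions may not accept a `decay` argument at all.
    return keras.optimizers.Adam(**ADAM_CONFIG)


if __name__ == "__main__":
    # Print the float32-rounded values for comparison with the table.
    for key in ("learning_rate", "beta_1", "beta_2"):
        print(key, to_float32(ADAM_CONFIG[key]))
```

Calling `build_optimizer()` and passing the result to `model.compile(optimizer=...)` should give an optimizer configured like the one in the table. The loss, metrics, and model architecture are not documented in this card and would have to come from elsewhere.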