mosaicml/mosaic-bert-base
Mosaic ML, Inc.
Fill-Mask · Transformers · PyTorch · c4 · English · bert · custom_code · arxiv: 8 papers
License: apache-2.0
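Because of the custom_code tag, the repo ships its own modeling files (bert_layers.py, configuration_bert.py), so loading it through Transformers requires trust_remote_code=True. A minimal fill-mask sketch, assuming the standard bert-base-uncased tokenizer (the listing below shows no tokenizer files, so that pairing is an assumption):

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumption: mosaic-bert-base pairs with the standard bert-base-uncased tokenizer;
# no tokenizer files appear in this repo, so that choice is a guess.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# trust_remote_code=True is needed because the repo provides its own modeling code
# (bert_layers.py, configuration_bert.py) instead of a stock Transformers class.
model = AutoModelForMaskedLM.from_pretrained(
    "mosaicml/mosaic-bert-base", trust_remote_code=True
)
model.eval()

inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the highest-scoring token at the masked position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
```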
mosaic-bert-base · 4 contributors · History: 28 commits
Latest commit 8a9076d by jacobfulano, 10 months ago: Update detail about Triton Flash Attention with ALiBi implementation
.gitattributes (1.48 kB): initial commit, over 1 year ago
README.md (14.6 kB): Update detail about Triton Flash Attention with ALiBi implementation, 10 months ago
bert_layers.py (47.3 kB): Upload BertForMaskedLM, over 1 year ago
bert_padding.py (6.26 kB): Upload BertForMaskedLM, over 1 year ago
config.json (827 Bytes): Change attention_probs_dropout_prob to 0.1 so that FlashAttention/triton dependencies are avoided, 10 months ago
configuration_bert.py (1.02 kB): Upload BertForMaskedLM, over 1 year ago
flash_attn_triton.py (42.7 kB): Upload BertForMaskedLM, over 1 year ago
generation_config.json (90 Bytes): Upload BertForMaskedLM, over 1 year ago
pytorch_model.bin (550 MB, LFS, pickle): Upload BertForMaskedLM, over 1 year ago
    Detected pickle imports (3): torch._utils._rebuild_tensor_v2, torch.FloatStorage, collections.OrderedDict
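The config.json commit above ("Change attention_probs_dropout_prob to 0.1 so that FlashAttention/triton dependencies are avoided") suggests the Triton FlashAttention path in flash_attn_triton.py is only taken when attention dropout is zero. A sketch of opting back into that path by overriding the config; the exact gating condition, and the need for a CUDA GPU plus the triton package, are assumptions drawn from the commit messages rather than anything confirmed by this listing:

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# Assumption drawn from the config.json commit message: the Triton FlashAttention
# path in flash_attn_triton.py is bypassed whenever attention_probs_dropout_prob
# is non-zero, so forcing it back to 0.0 should opt back in. Running that path is
# assumed to require a CUDA GPU and the triton package.
config = AutoConfig.from_pretrained("mosaicml/mosaic-bert-base", trust_remote_code=True)
config.attention_probs_dropout_prob = 0.0

model = AutoModelForMaskedLM.from_pretrained(
    "mosaicml/mosaic-bert-base", config=config, trust_remote_code=True
).to("cuda")
```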