hazyresearch/based-1b-50b

simarora committed on May 5

Commit e5f50a6 • 1 Parent(s): 68d9094

Update README.md

Files changed (1)

README.md +2 -2

@@ -8,9 +8,9 @@ language:
 This model is pretrained Based model.
-As a quality reference, we include a pretrained Mamba model provided here: https://huggingface.co/hazyresearch/mamba-1b-50b
-Both checkpoints are pretrained on **50Bn tokens** of the Pile in the exact same data order using next token prediction.
+As a quality reference, we include a pretrained Mamba model provided here: https://huggingface.co/hazyresearch/mamba-1b-50b and a pretrained attention (Llama architecture) model provided here: https://huggingface.co/hazyresearch/attn-1b-50bn
+All three checkpoints are pretrained on **50Bn tokens** of the Pile in the exact same data order using next token prediction.
 A WandB report for training is here: https://api.wandb.ai/links/hazy-research/ggo9rst2