RWKV-2 430M

Model Description

RWKV-2 430M is an L24-D1024 (24 layers, embedding size 1024) causal language model trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details.

For now, you have to use my GitHub code (https://github.com/BlinkDL/RWKV-v2-RNN-Pile) to run it.

ctx_len = 768
n_layer = 24
n_embd = 1024
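As a rough sanity check on these settings, the sketch below collects them in a small config object and estimates the parameter count. The `RWKV2Config` class and `estimate_params` helper are hypothetical and are not part of the RWKV repositories. The vocabulary size of 50277 and the figure of 13 square (n_embd x n_embd equivalent) weight matrices per layer are assumptions, not values stated in this card, and small per-channel parameters are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RWKV2Config:
    """Hyperparameters listed in this model card (hypothetical container)."""
    ctx_len: int = 768
    n_layer: int = 24
    n_embd: int = 1024


def estimate_params(cfg, vocab_size=50277, square_mats_per_layer=13):
    """Back-of-the-envelope parameter count.

    ASSUMPTIONS (not from the model card):
      - vocab_size=50277
      - each layer holds the equivalent of 13 n_embd x n_embd matrices
      - separate input embedding and output head, each vocab_size x n_embd
    """
    per_layer = square_mats_per_layer * cfg.n_embd ** 2
    embeddings = 2 * vocab_size * cfg.n_embd
    return cfg.n_layer * per_layer + embeddings


cfg = RWKV2Config()
print(f"~{estimate_params(cfg) / 1e6:.0f}M parameters (rough estimate)")
```

Under these assumptions the estimate lands close to the 430M in the model name, but it is only an approximation and should not be read as the exact size of the checkpoint.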

Final checkpoint: 20220615-10803.pth, trained on the Pile for 331B tokens. Results:

  • Pile loss 2.349
  • LAMBADA ppl 15.34, acc 42.42%
  • PIQA acc 67.03%
  • SC2016 acc 62.05%
  • Hellaswag acc_norm 38.47%
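
The Pile figure above is a loss while the LAMBADA figure is a perplexity, so the two are on different scales. A minimal sketch of the conversion, assuming the Pile loss is a mean cross-entropy in nats per token (the card does not state the unit); `loss_to_perplexity` is a hypothetical helper name:

```python
import math


def loss_to_perplexity(loss_nats):
    """Convert a mean cross-entropy loss (ASSUMED to be in nats per token)
    into perplexity: ppl = exp(loss)."""
    return math.exp(loss_nats)


def perplexity_to_loss(ppl):
    """Inverse conversion: loss = ln(ppl)."""
    return math.log(ppl)


pile_ppl = loss_to_perplexity(2.349)          # Pile loss from the list above
lambada_loss = perplexity_to_loss(15.34)      # LAMBADA ppl from the list above
print(f"Pile perplexity ~ {pile_ppl:.2f}")
print(f"LAMBADA loss    ~ {lambada_loss:.3f}")
```

Under that assumption the Pile loss corresponds to a perplexity of roughly 10.5. The two datasets differ, and the tokenization may differ too, so the converted numbers are not directly comparable with each other.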