Data: a mix of c4 and codeparrot, about 1:1 by sample count but 1:4 by token count. The mix is heavily biased toward code (Python, Go, Java, JavaScript, C, C++). Params:

  • batch size 64 * 2048 * 8 = 1048576 tokens
  • lr chosen automatically, as in the EAI sae codebase
  • auxk_alpha 0.03
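
The batch-size arithmetic above can be checked with a short sketch. The source gives only the product 64 * 2048 * 8; the meaning assigned to each factor below (sequences per step, context length, number of devices or accumulation steps) is an assumption made for illustration, and the names are hypothetical.

```python
# Hypothetical names: the model card only states 64 * 2048 * 8 = 1048576 tokens.
SEQS_PER_STEP = 64   # assumed: sequences per device per step
CONTEXT_LEN = 2048   # assumed: tokens per sequence
REPLICAS = 8         # assumed: devices or gradient-accumulation steps


def tokens_per_batch(seqs: int, context_len: int, replicas: int) -> int:
    """Return the number of tokens consumed by one optimizer step."""
    return seqs * context_len * replicas


batch_tokens = tokens_per_batch(SEQS_PER_STEP, CONTEXT_LEN, REPLICAS)
print(batch_tokens)            # 1048576
print(batch_tokens == 2**20)   # True: the batch is exactly 2^20 tokens
```

Whatever the factors stand for, the product is exactly 2^20 tokens per step.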