# jamba-H1024_L12-v0.07-fineweb-1M-med

A mid-training checkpoint.

- arch: jamba (see the model card for kernels and usage notes)
- tokenizer: claude3, packaged as an HF GPT2 tokenizer
- trained on context lengths of at most 2048 tokens so far
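Loading the checkpoint might look like the following. This is a minimal sketch, assuming the `transformers` and `torch` packages are installed; the `generate` helper name is illustrative, not part of the repo.

```python
# Minimal usage sketch. The repo ships custom Jamba modeling code,
# hence trust_remote_code=True is required when loading.

MODEL_ID = "pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and decode a short continuation (illustrative helper)."""
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    # The checkpoint has only seen sequences up to 2048 tokens, so keep
    # prompt length plus max_new_tokens under that limit.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```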
## Numbers for this checkpoint

hf (pretrained=pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med,trust_remote_code=True,dtype=float), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
| Tasks          | Version | Filter | n-shot | Metric     |    Value |   | Stderr |
|----------------|---------|--------|--------|------------|---------:|---|-------:|
| winogrande     | 1       | none   | 0      | acc        |   0.4972 | ± | 0.0141 |
| piqa           | 1       | none   | 0      | acc        |   0.6072 | ± | 0.0114 |
|                |         | none   | 0      | acc_norm   |   0.6034 | ± | 0.0114 |
| openbookqa     | 1       | none   | 0      | acc        |   0.1660 | ± | 0.0167 |
|                |         | none   | 0      | acc_norm   |   0.2800 | ± | 0.0201 |
| lambada_openai | 1       | none   | 0      | perplexity | 157.6757 | ± | 6.8536 |
|                |         | none   | 0      | acc        |   0.2127 | ± | 0.0057 |
| boolq          | 2       | none   | 0      | acc        |   0.6235 | ± | 0.0085 |
| arc_easy       | 1       | none   | 0      | acc        |   0.3944 | ± | 0.0100 |
|                |         | none   | 0      | acc_norm   |   0.3531 | ± | 0.0098 |