jon-tow committed on
Commit
2724cd4
1 Parent(s): 49c5af4

Create README.md

Files changed (1)
  1. README.md +13 -0
README.md ADDED
@@ -0,0 +1,13 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - EleutherAI/the_pile_deduplicated
+ language:
+ - en
+ ---
+
+ Pythia-2.8B Deduped 4K is a [Pythia-2.8B Deduped](https://huggingface.co/EleutherAI/pythia-2.8b-deduped) model fine-tuned with a 4096-token context length.
+ Training resumed from the step 143,000 checkpoint and continued on The Pile v1 Deduped (threshold=0.87).
+ This particular model comes from a checkpoint captured at step 175,500, after an extra 134,217,728,000 tokens of training.
+
+ Note: sequence-length warmup was not used when increasing the context from 2048 to 4096; in hindsight, it should have been applied.
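In GPT-NeoX-family models like Pythia, the context length is governed by the `max_position_embeddings` field of the model config. A minimal sketch of the change this fine-tune implies, using the standard `transformers` `GPTNeoXConfig` class (this is an illustration only, not the model's actual published config file):

```python
from transformers import GPTNeoXConfig

# Base Pythia-2.8B Deduped ships with a 2048-token position limit;
# this fine-tuned variant trains with a 4096-token window instead.
config = GPTNeoXConfig(max_position_embeddings=4096)

print(config.max_position_embeddings)  # 4096
```

Loading the fine-tuned checkpoint through `AutoModelForCausalLM.from_pretrained` would pick this value up from the repository's `config.json` automatically.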