malteos/bloom-6b4-clp-german

malteos committed on Jan 24, 2023

Commit c0c608e • 1 Parent(s): 8a80532

Create README.md

Files changed (1)

README.md +34 -0

README.md ADDED

@@ -0,0 +1,34 @@

+---
+license: bigscience-bloom-rail-1.0
+datasets:
+- oscar
+language:
+- de
+library_name: transformers
+pipeline_tag: text-generation
+---
+# BLOOM-CLP German (6.4B parameters)
+This is a monolingual German language model based on [BLOOM-7b1](https://huggingface.co/bigscience/bloom-7b1) and trained using the [CLP-Transfer](https://arxiv.org/abs/2301.09626) method.
+You can try out the model on the [European Language Grid](https://live.european-language-grid.eu/catalogue/tool-service/20825/try%20out/).
+## Training dataset
+- Approx. 50B German tokens
+- Web-crawled content from the German subset of [OSCAR v22.01](https://oscar-corpus.com/post/oscar-v22-01/) (excluding content tagged as header, footer, noisy, or adult)
+- Web-crawled content from the [GC4 Corpus](https://german-nlp-group.github.io/projects/gc4-corpus.html) (including only the head and middle parts)
+- German court decisions from [Open Legal Data](http://openlegaldata.io/)
+## Code
+- [BigScience's Megatron-Deepspeed fork](https://github.com/bigscience-workshop/Megatron-DeepSpeed)
+## Hardware
+- 32x A100-40GB GPUs
+## Evaluation
+TBA (see the [CLP-Transfer paper](https://arxiv.org/abs/2301.09626))
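
The card names `transformers` as the library and `text-generation` as the pipeline tag, but it does not show a loading call. The following is a minimal sketch, assuming the checkpoint in this repository loads through the standard `transformers` Auto classes. The function name `generate_german` and the default `max_new_tokens` value are illustrative choices, not taken from the card.

```python
# Repository ID as shown at the top of this commit page.
MODEL_ID = "malteos/bloom-6b4-clp-german"


def generate_german(prompt: str, max_new_tokens: int = 50) -> str:
    """Generate a German continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require
    # transformers/torch to be installed or any weights to be downloaded.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: default loading settings; a 6.4B-parameter model needs
    # a correspondingly large amount of memory.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_german("Die Hauptstadt von Deutschland ist")` downloads the weights on first use and returns the prompt followed by the model's continuation.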