HuYaLM 100B

Hugging Face YaLM 100B (by BlackSamorez) is a transformers-compatible implementation of the YaLM 100B model. Originally trained by Yandex, the model was trained on 800 A100 GPUs using 1.7 TB of diverse text data, including online texts and books, in both English and Russian.

The motivation behind this particular implementation is to update the originally published, outdated code to align with the latest advancements in the field. As this code is compatible with the transformers library, it inherently supports crucial features like quantization (for model size optimization) and adapter training (for efficient fine-tuning).
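As a rough illustration of what transformers compatibility enables, the sketch below loads the checkpoint with 8-bit quantization (via bitsandbytes) and attaches a small LoRA adapter (via the peft library) so that fine-tuning only updates the adapter weights. The repository id, the quantization flag, and the LoRA hyperparameters are illustrative assumptions, not settings verified against this model card:

```python
def load_quantized_with_lora(repo_id="BlackSamorez/yalm-100b-hf"):
    """Sketch: load the model in 8-bit and wrap it with a LoRA adapter.

    The repo_id above is a hypothetical placeholder; substitute the
    actual Hugging Face repository name. Imports are deferred so this
    sketch can be read without the libraries installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        device_map="auto",   # shard layers across available GPUs
        load_in_8bit=True,   # bitsandbytes 8-bit quantization to cut memory
    )
    lora = LoraConfig(
        r=8,                 # low-rank update dimension (illustrative)
        lora_alpha=32,
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    # Only the LoRA parameters remain trainable after wrapping.
    return tokenizer, get_peft_model(model, lora)
```

Even in 8-bit, a 100B-parameter model requires on the order of 100 GB of GPU memory, so `device_map="auto"` is used to spread the layers across multiple devices.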

For more details on training, acceleration, and stabilization techniques, you can refer to articles on Medium (in English) and Habr (in Russian). The original code from Yandex is available on GitHub.

This code and model are distributed under the Apache 2.0 license, which allows for commercial use.
