license: cc-by-nc-nd-4.0

Introduction

TQCompressedGPT-2 is a compressed GPT-2 model built on an improved tensor decomposition. It addresses the computational and storage demands of NLP models by introducing a permutation-based enhancement to Kronecker decomposition, significantly reducing model size while maintaining performance.
TQCompressedGPT2 © 2024 by Terra Quantum AG is licensed under CC BY-NC-ND 4.0. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/
Any entity who wishes to use this library for commercial purposes should contact info@terraquantum.swiss for more information.
License: CC BY-NC-ND 4.0

Features

Model Size Reduction: Compresses the GPT-2 small model from 124 million to 81 million parameters (see the illustrative calculation after this list).
Permutation-Based Enhancement: Introduces a new permutation algorithm for matrix factorization, minimizing performance degradation.
Efficient Training Strategy: Employs multi-step knowledge distillation with a fraction (3.1%) of the OpenWebText dataset.
Performance: Outperforms DistilGPT-2 in comparative evaluations.
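
To make the parameter-count reduction concrete, here is a rough, illustrative calculation in Python; the factor shapes below are hypothetical and are not the exact per-layer configuration used in TQCompressedGPT-2:

# Approximating one GPT-2 small MLP weight W (768 x 3072) by a single
# Kronecker product A ⊗ B stores only the two small factors.
# The factor shapes here are illustrative, not the ones used in the paper.
m1, n1 = 32, 64                    # shape of factor A
m2, n2 = 24, 48                    # shape of factor B, so A ⊗ B has shape (768, 3072)

dense_params = 768 * 3072          # 2,359,296 parameters in the dense weight
kron_params = m1 * n1 + m2 * n2    # 2,048 + 1,152 = 3,200 parameters in the factors
print(f"per-layer compression: {dense_params / kron_params:.0f}x")

The end-to-end reduction (124M to 81M parameters) is much smaller than such per-layer ratios, since the compression applied in practice must preserve accuracy and not all parameters are reduced this aggressively.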

Permutation-Based Enhancement

In our work, we employ a permutation-based algorithm that achieves a better decomposition approximation of the weight matrices:
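
A schematic form of the objective, with notation that is illustrative rather than taken verbatim from the paper: instead of approximating a weight matrix W directly by a Kronecker product, the method first searches for a permutation that makes W easier to approximate,

\min_{P \in \mathcal{P}} \; \min_{A,\, B} \; \lVert P W - A \otimes B \rVert_F^2 ,

where \mathcal{P} denotes a set of permutation matrices acting on the rows of W; fixing P = I recovers the standard Kronecker decomposition.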

Methodology

For more details on the techniques behind TQCompressedGPT-2, refer to our paper: TQCompressor: Improving Tensor Decomposition in Neural Networks via Permutations (ADD LINK)
TQCompressed Decomposition: Focuses on optimal permutation of weight matrices followed by Kronecker decomposition.
Knowledge Distillation: Uses an iterative compression method coupled with knowledge distillation to recover the performance lost to compression (a minimal sketch follows this list).
Application: Demonstrated on GPT-2; the approach itself is applicable to a wide range of neural network architectures.
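
A minimal sketch of the compress-then-distill loop, assuming a standard logit-matching distillation objective; compress_next_block and the data/optimizer variables are hypothetical placeholders, not functions or objects shipped with this repository:

import torch
import torch.nn.functional as F

def distill_step(student, teacher, batch, optimizer, temperature=2.0):
    # One knowledge-distillation step: match the student's output distribution
    # to the frozen teacher's on the same batch of OpenWebText tokens.
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    student_logits = student(**batch).logits
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Multi-step scheme (schematic): compress one group of layers at a time and
# distill before compressing the next group, so approximation errors do not
# accumulate. compress_next_block is a hypothetical placeholder for the
# permutation + Kronecker factorization of the next group of weight matrices.
# for block in blocks_to_compress:
#     student = compress_next_block(student, block)
#     for batch in openwebtext_subset:      # roughly 3.1% of OpenWebText
#         distill_step(student, teacher, batch, optimizer)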

Usage

The model and code are publicly available at:
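
A minimal loading sketch with the Hugging Face transformers library; the repository ID below is a placeholder, so substitute the actual ID from this model page, and add trust_remote_code=True if the checkpoint ships custom modeling code:

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "tq-ag/TQCompressedGPT2"   # placeholder ID; replace with the real repository

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("TQCompressedGPT-2 is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))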

Citation

If you find TQCompressedGPT-2 useful in your research, please cite the following paper:

@article{tqcompressedgpt2,
  title={TQCompressor: Improving Tensor Decomposition in Neural Networks via Permutations},
  author={Abronin, V. and Naumov, A. and Mazur, D. and Bystrov, D. and Tsarova, K. and Melnikov, Ar. and Oseledets, I. and Dolgov, S. and Brasher, R. and Perelshtein, M.},
  journal={arXiv preprint arXiv:[insert_arxiv_id]},
  year={2023}
}

Acknowledgments

  • Terra Quantum AG, Kornhausstrasse 25, 9000 St. Gallen, Switzerland
  • Project contributors and researchers.