Papers
arxiv:2401.11626

Freely Long-Thinking Transformer (FraiLT)

Published on Jan 21, 2024

Abstract

Freely Long-Thinking Transformer (FraiLT) is an improved transformer model designed to enhance processing capability without scaling up model size. It uses a recursive approach, iterating over a subset of its layers multiple times, and introduces iteration encodings so the model remains aware of which cycle it is in. These iteration encodings allow FraiLT to achieve the interpretive depth of larger models in a compact form. When evaluated on a synthetic story dataset, FraiLT outperformed larger models, delivering high-quality performance while reducing memory demands. This model represents a step toward more efficient and accessible language models.
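The abstract describes the mechanism only at a high level, so the following is a minimal PyTorch sketch of the idea as stated: a small stack of transformer blocks applied repeatedly to the same hidden states, with a learned iteration encoding added before each pass. All names (FraiLTSketch, iter_enc, num_iters) and hyperparameters are hypothetical illustrations, not taken from the paper.

```python
import torch
import torch.nn as nn

class FraiLTSketch(nn.Module):
    """Minimal sketch of a layer-recursive transformer with iteration encodings.

    Hypothetical reconstruction from the abstract: a small stack of shared
    transformer blocks is applied num_iters times, and a learned "iteration
    encoding" is added to the hidden states on each pass so the reused
    weights know which cycle they are in.
    """

    def __init__(self, vocab_size=32000, d_model=256, n_heads=4,
                 n_layers=2, num_iters=4, max_len=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        # One learned vector per iteration: like a positional encoding,
        # but indexed by recursion depth instead of token position.
        self.iter_enc = nn.Embedding(num_iters, d_model)
        self.blocks = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
             for _ in range(n_layers)]
        )
        self.num_iters = num_iters
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        positions = torch.arange(tokens.size(1), device=tokens.device)
        h = self.embed(tokens) + self.pos(positions)
        for i in range(self.num_iters):
            # Tag the hidden states with the current iteration so the
            # shared blocks can condition on recursion depth.
            idx = torch.tensor(i, device=tokens.device)
            h = h + self.iter_enc(idx)
            for block in self.blocks:
                h = block(h)
        return self.lm_head(h)

model = FraiLTSketch()
tokens = torch.randint(0, 32000, (2, 128))  # (batch, seq_len)
logits = model(tokens)                      # (2, 128, 32000)
```

Under these assumptions the parameter count is that of n_layers blocks, while the effective depth is n_layers * num_iters; this weight sharing across iterations is what would let a compact model approximate the depth of a larger one while reducing memory demands.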
