---
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
language:
- en
- am
- ar
- as
- az
- be
- bg
- bn
- br
- bs
- ca
- cs
- cy
- da
- de
- el
- eo
- es
- et
- eu
- fa
- ff
- fi
- fr
- fy
- ga
- gd
- gl
- gn
- gu
- ha
- he
- hi
- hr
- ht
- hu
- hy
- id
- ig
- is
- it
- ja
- jv
- ka
- kk
- km
- kn
- ko
- ku
- ky
- la
- lg
- li
- ln
- lo
- lt
- lv
- mg
- mk
- ml
- mn
- mr
- ms
- my
- ne
- nl
- 'no'
- ns
- om
- or
- pa
- pl
- ps
- pt
- qu
- rm
- ro
- ru
- sa
- si
- sc
- sd
- sk
- sl
- so
- sq
- sr
- ss
- su
- sv
- sw
- ta
- te
- th
- tl
- tn
- tr
- ug
- uk
- ur
- uz
- vi
- wo
- xh
- yi
- yo
- zu
datasets:
- yahma/alpaca-cleaned
- gbharti/wealth-alpaca_lora
- saillab/taco-datasets
- xu-song/cc100-samples
- ontocord/fineweb-permissive-multilingual-2m
- MuskumPillerum/General-Knowledge
- yirenc/general_knowledge_boolean
- nampdn-ai/tiny-textbooks
- nampdn-ai/tiny-codes
- bigcode/the-stack-smol-xs
- m-a-p/CodeFeedback-Filtered-Instruction
- jtatman/python-code-dataset-500k
- iamtarun/python_code_instructions_18k_alpaca
- HuggingFaceH4/CodeAlpaca_20K
- gair-prox/open-web-math-pro
- rvv-karma/Math-QA
- ajibawa-2023/Maths-College
- microsoft/orca-math-word-problems-200k
- fblgit/simple-math
- SkunkworksAI/reasoning-0.01
- badrex/llm-emoji-dataset
tags:
- litgpt
- litdata
---

# tangled-llama-58m-32k-base-v0.1
A pretrained language model based on the Llama architecture, with about 58M parameters. It was trained on 11.4B (11,422,750,857) tokens from more than 0.8M (796,399) dataset rows.
This model isn't designed for immediate use, but rather for continued pretraining and finetuning on a downstream task. While it can handle a context length of up to 128K (131,072) tokens, it was pretrained on sequences of 2K (2,048) tokens.
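As a quick sanity check, the sketch below loads the checkpoint with 🤗 Transformers and samples a raw continuation. The `model_id` is an assumed placeholder for this repository's Hub id; substitute the actual path.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tangled-llama-58m-32k-base-v0.1"  # assumed placeholder for the actual Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# This is a base model: expect raw text continuation, not instruction following.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```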
The objective is to preserve a streamlined cognitive and reasoning core while eliminating redundant memorized knowledge from the model.
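For the intended continued-pretraining workflow, a minimal sketch along these lines could serve as a starting point. It reuses `yahma/alpaca-cleaned` from the dataset list above purely for illustration; the hyperparameters are assumptions, not the values used for the original run. Note the 2,048-token blocks, matching the pretraining sequence length.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "tangled-llama-58m-32k-base-v0.1"  # assumed placeholder Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Llama-style tokenizers often ship without a pad token; reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Small illustrative corpus; any plain-text dataset works here.
dataset = load_dataset("yahma/alpaca-cleaned", split="train[:1000]")

def tokenize(batch):
    # Truncate to 2,048 tokens, the sequence length used during pretraining.
    return tokenizer(batch["output"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=8,
        num_train_epochs=1,
        learning_rate=5e-5,  # illustrative, tune for your task
    ),
    train_dataset=tokenized,
    # mlm=False gives causal-LM labels (inputs shifted by one position).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```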