TinyLLM

Overview

This repository hosts a small language model developed as part of the TinyLLM framework ([arxiv link]). The models are designed for embedded sensing applications and fine-tuned with sensor data, so that a language model can be hosted locally on low-compute devices such as single-board computers. They are based on the GPT-2 architecture and were trained on NVIDIA H100 GPUs. This repository provides base models that can be fine-tuned further for specific downstream tasks in embedded sensing.

Model Information

Parameters: 124M (Hidden Size = 768)
Architecture: Decoder-only transformer
Training Data: Up to 10B tokens from the SHL and Fineweb datasets, combined in a 3:7 ratio
Input and Output Modality: Text
Context Length: 1024 tokens
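
The figures above can be cross-checked with a little arithmetic. The sketch below counts the parameters of a GPT-2 style decoder and splits a token budget by the 3:7 ratio. Only the hidden size (768), the context length (1024) and the ratio come from this card. The layer count (12), vocabulary size (50257), 4x MLP width and tied input/output embeddings are assumptions taken from the standard GPT-2 small configuration, and reading the ratio as SHL:Fineweb is also an assumption based on the order in which the datasets are listed.

```python
def gpt2_param_count(n_layer=12, d_model=768, n_ctx=1024, vocab_size=50257):
    """Count parameters of a GPT-2 style decoder with tied embeddings.

    n_layer and vocab_size are assumed (standard GPT-2 small values);
    d_model and n_ctx are the values listed in the model card.
    """
    # Token embedding table plus learned position embedding table.
    embeddings = vocab_size * d_model + n_ctx * d_model
    # Attention: fused QKV projection and output projection, each with bias.
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4 * d_model, then project back, each with bias.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_layer = attn + mlp + layer_norms
    final_layer_norm = 2 * d_model
    return embeddings + n_layer * per_layer + final_layer_norm


def split_tokens(total_tokens, shl_parts=3, fineweb_parts=7):
    """Split a token budget between SHL and Fineweb by an integer ratio."""
    parts = shl_parts + fineweb_parts
    shl_tokens = total_tokens * shl_parts // parts
    return shl_tokens, total_tokens - shl_tokens


print(gpt2_param_count())            # 124439808
print(split_tokens(10_000_000_000))  # (3000000000, 7000000000)
```

Under these assumptions the count comes to about 124.4M, which matches the 124M listed above, and the 10B-token upper bound corresponds to roughly 3B SHL tokens and 7B Fineweb tokens.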

Acknowledgements

We acknowledge the open-source frameworks llm.c and llama.cpp, as well as the sensor dataset provided by SHL, all of which were instrumental in training and testing these models.

Usage

The model can be used in two primary ways:

With Hugging Face’s Transformers Library

```
from transformers import pipeline
import torch

path = "tinyllm/124M-0.3"
prompt = "The sea is blue but it's his red sea"

# Build a text-generation pipeline; the weights are loaded in bfloat16.
generator = pipeline(
    "text-generation",
    model=path,
    max_new_tokens=30,
    repetition_penalty=1.3,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
print(generator(prompt)[0]["generated_text"])
```

With llama.cpp

Generate a GGUF model file with llama.cpp's convert_hf_to_gguf.py script, then use the generated GGUF file for inference.
```
python3 convert_hf_to_gguf.py models/mymodel/
```
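
Once a GGUF file exists, inference can be driven from Python by calling the llama.cpp command-line binary. The following is a minimal sketch, not part of the original instructions. It assumes the binary is named `llama-cli` and is on the PATH (the name depends on the llama.cpp version and build). The helper names `build_llama_cli_command` and `run_llama_cli` and the GGUF path in the usage note are hypothetical.

```python
import shutil
import subprocess


def build_llama_cli_command(gguf_path, prompt, n_predict=30, binary="llama-cli"):
    """Return the argument list for one llama.cpp text-generation run."""
    return [
        binary,                # assumed binary name; adjust to your build
        "-m", gguf_path,       # path to the converted GGUF model file
        "-p", prompt,          # prompt text
        "-n", str(n_predict),  # number of tokens to generate
    ]


def run_llama_cli(gguf_path, prompt, n_predict=30, binary="llama-cli"):
    """Run llama.cpp on the prompt and return whatever it writes to stdout."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} was not found on PATH")
    cmd = build_llama_cli_command(gguf_path, prompt, n_predict, binary)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `run_llama_cli("path/to/mymodel.gguf", "The sea is blue but it's his red sea")` would generate up to 30 new tokens, where the path is a placeholder for wherever the conversion step wrote its output file.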

Disclaimer

This model is intended solely for research purposes.
