
🎯 Pico: Tiny Language Models for Learning Dynamics Research

Pico is a framework for training and analyzing small language models, designed with clarity and educational use in mind. Built on a LLaMA-style architecture, Pico makes it easy to experiment with and understand transformer-based language models.

🔑 Key Features

  • Simple Architecture: Clean, modular implementation of core transformer components
  • Educational Focus: Well-documented code with clear references to academic papers
  • Research Ready: Built-in tools for analyzing model learning dynamics
  • Efficient Training: Pre-tokenized dataset and optimized training loop
  • Modern Stack: Built with PyTorch Lightning, Wandb, and HuggingFace integrations

🏗️ Core Components

  • RMSNorm for stable layer normalization
  • Rotary Positional Embeddings (RoPE) for position encoding
  • Multi-head attention with KV-cache support
  • SwiGLU activation function
  • Residual connections throughout
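
To make three of these components concrete, here is a minimal, dependency-free sketch of RMSNorm, RoPE, and the SwiGLU gating step applied to a single vector. It is an illustration only and is not taken from Pico's codebase: the function names (`rms_norm`, `rope`, `swiglu`), the `eps` and `base` defaults, and the choice to rotate adjacent pairs of dimensions are all assumptions made for this example. A real implementation would operate on batched PyTorch tensors.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then apply a per-dimension gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]


def rope(x, position, base=10000.0):
    """Rotate each adjacent pair (x[i], x[i+1]) by position * base**(-i/d).

    Assumption: pairs are interleaved; some implementations instead pair
    dimension i with dimension i + d/2.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.append(x[i] * c - x[i + 1] * s)
        out.append(x[i] * s + x[i + 1] * c)
    return out


def silu(v):
    """SiLU (also called swish): v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(gate, up):
    """Elementwise SiLU(gate) * up.

    In a full feed-forward block, gate and up would be two separate
    linear projections of the same input.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))
```

One property worth checking by hand: because RoPE applies a pure rotation per pair, the dot product between a rotated query and a rotated key depends only on the difference of their positions, so `dot(rope(q, 5), rope(k, 3))` matches `dot(rope(q, 7), rope(k, 5))` up to floating-point error.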

📚 References

Our implementation draws inspiration from and builds upon:

🀝 Contributing

We welcome contributions! Whether it's:

  • Adding new features
  • Improving documentation
  • Fixing bugs
  • Sharing experimental results

📝 License

Apache 2.0 License

📫 Contact