---
tags:
- llama
---
A randomly initialized checkpoint of a 252M-parameter custom transformer architecture. Two linear transformations connect it to llama2-70b: the first projects the llama2-70b embeddings from 8192 dimensions down to 1024, and the second projects the 1024-dimensional output back up to 8192 dimensions for the llama2-70b language modelling head.
To be trained.
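
The sizes of the two linear transformations described above can be worked out with a short sketch. It only does the arithmetic for the 8192-d to 1024-d and 1024-d to 8192-d projections and traces the activation shapes. The names `ProjectionSizes` and `trace_shapes` are hypothetical, and whether the layers carry bias terms is an assumption (off by default here). The card does not say how the projections are counted within the 252M figure, so the sketch makes no claim about that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectionSizes:
    """Dimensions taken from the model description (hypothetical helper)."""

    outer_dim: int = 8192  # width of the llama2-70b embeddings and LM head
    inner_dim: int = 1024  # width the custom transformer operates in

    @staticmethod
    def linear_params(d_in, d_out, bias=False):
        # A dense layer holds d_in * d_out weights, plus d_out biases if enabled.
        return d_in * d_out + (d_out if bias else 0)

    def down_params(self, bias=False):
        # Projection from the embedding space into the 1024-d space.
        return self.linear_params(self.outer_dim, self.inner_dim, bias)

    def up_params(self, bias=False):
        # Projection from the 1024-d space back to the LM-head space.
        return self.linear_params(self.inner_dim, self.outer_dim, bias)


def trace_shapes(seq_len, sizes):
    """Return (stage, activation shape) pairs for one sequence of tokens."""
    return [
        ("embeddings", (seq_len, sizes.outer_dim)),
        ("down_projection", (seq_len, sizes.inner_dim)),
        ("transformer", (seq_len, sizes.inner_dim)),
        ("up_projection", (seq_len, sizes.outer_dim)),
    ]


if __name__ == "__main__":
    sizes = ProjectionSizes()
    print("down projection weights:", sizes.down_params())
    print("up projection weights:", sizes.up_params())
    for stage, shape in trace_shapes(4, sizes):
        print(stage, shape)
```

Without biases, each projection holds 8192 × 1024 = 8,388,608 weights, so the pair accounts for 16,777,216 parameters.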