Update README.md
README.md CHANGED
@@ -71,5 +71,15 @@ A small 220M param (total) decoder model. This is the first version of the model
 - GQA (32 heads, 8 key-value), context length 2048
 - train-from-scratch on one GPU :)
 
+## Links
+
+Here are some fine-tunes we did, but there are many more possibilities out there!
+
+- instruct
+  - openhermes - [link](https://huggingface.co/BEE-spoke-data/smol_llama-220M-openhermes)
+  - open-instruct - [link](https://huggingface.co/BEE-spoke-data/smol_llama-220M-open_instruct)
+- code
+  - python (pypi) - WIP
+
 
 ---
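
For context on the GQA bullet in the diff above: below is a minimal PyTorch sketch of grouped-query attention with the README's numbers, 32 query heads sharing 8 key-value heads. This is not the model's actual code; the tensor shapes, names, and the use of `scaled_dot_product_attention` are illustrative assumptions.

```python
# Illustrative sketch of grouped-query attention (GQA), not the model's code.
# From the README: 32 query heads, 8 key-value heads -> groups of 4.
import torch
import torch.nn.functional as F

batch, seq_len, head_dim = 1, 16, 32      # assumed toy dimensions
n_heads, n_kv_heads = 32, 8               # from the README bullet
group_size = n_heads // n_kv_heads        # 4 query heads per KV head

q = torch.randn(batch, n_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Expand each key-value head so that group_size consecutive query heads
# attend to the same keys and values.
k = k.repeat_interleave(group_size, dim=1)  # -> (batch, 32, seq_len, head_dim)
v = v.repeat_interleave(group_size, dim=1)

# Standard causal scaled-dot-product attention over the expanded heads.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 32, 16, 32])
```

With 4 query heads per key-value head, the KV cache holds a quarter as many heads as full multi-head attention at the same width, which is part of what makes a model this size practical to train and serve on one GPU.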
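
And a quick way to try one of the fine-tunes linked in the new section, assuming the standard `transformers` API; the prompt and generation settings are placeholders, not recommendations from the model authors.

```python
# Hedged usage sketch: load a linked fine-tune and generate some text.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "BEE-spoke-data/smol_llama-220M-openhermes"  # from the links above
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

inputs = tokenizer("Write a haiku about small language models.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```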