Update README.md
README.md
CHANGED
@@ -4,10 +4,19 @@ datasets:
 - LDJnr/Capybara
 ---
 
-###EVEN SMALLER Frankenstein of smolLm-0.13b upped to 0.
+### EVEN SMALLER Frankenstein of smolLm-0.13b upped to 0.18b
 Use this frankenbase for training.
-
+Sorry for the mislabelling: the model is 0.18b (181M parameters), not 0.15b.
+I did not expect this repo to blow up, and now all the training scripts depend on it.
 
+* ACKNOWLEDGE THE HF PAGE IN YOUR FUTURE PAPERS OR I WILL DRAG YOUR ORG ON TWITTER LIKE I DID WITH COHERE LOL
+
+> [!TIP]
+> 🐧 If you're here from twitter and impatient, get the trained checkpoint file that runs on 1 cpu core:
+>
+> Make sure to install the latest llama.cpp first; it's easy on linux & mac:
+> git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make -j
+
+Now for the magic trained finetune that runs at insane speeds:
 
 ```bash
 wget https://huggingface.co/nisten/Biggie-SmoLlm-0.15B-Base/resolve/main/biggie_groked_int8_q8_0.gguf
 ```
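The tip above covers the llama.cpp CLI route. As a minimal sketch of the same single-core run from Python, here is the optional llama-cpp-python binding; the README itself only shows the CLI, so treat the package choice and parameters as assumptions:

```python
# Minimal sketch: load the GGUF fetched by the wget above with the
# llama-cpp-python bindings and generate on a single CPU thread.
# Assumes `pip install llama-cpp-python`; the README only documents the CLI.
from llama_cpp import Llama

llm = Llama(
    model_path="biggie_groked_int8_q8_0.gguf",  # file from the wget above
    n_threads=1,  # mirrors the "runs on 1 cpu core" claim
    n_ctx=2048,
)
out = llm("Once upon a time", max_tokens=64)
print(out["choices"][0]["text"])
```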
@@ -31,7 +40,7 @@ We're talking about a convergence of whole bunch of stuff, more papers will be written
 2. **BitNet Integration**:
 4. **Experimental GrokAdamW Optimizer**:
 
-##
+## Prior work, from last week
 
 Credits for optimizer go to [@cognitivecompai](https://github.com/cognitivecomputations/grokadamw) for laying the groundwork with the original GrokAdamW optimizer.
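Since the GrokAdamW credit above is what most people land here for, here is a minimal sketch of swapping it into a standard PyTorch loop. The import path and constructor arguments are assumptions based on the linked repo; its grokking-signal options are omitted, so check the repo README for the exact API:

```python
# Minimal sketch: GrokAdamW as a drop-in torch.optim-style optimizer.
# Assumption: `from grokadamw import GrokAdamW` per the linked repo;
# grokking-signal functions, decay rate, and gradient clipping are
# deliberately not shown here.
import torch
import torch.nn as nn
from grokadamw import GrokAdamW  # assumed import path

model = nn.Linear(768, 768)  # stand-in for the 0.18b frankenbase
optimizer = GrokAdamW(model.parameters(), lr=1e-3)

x = torch.randn(8, 768)
loss = model(x).pow(2).mean()  # dummy loss for the sketch
loss.backward()
optimizer.step()
optimizer.zero_grad()
```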
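As for the BitNet item in the list above: that name usually refers to the absmean ternary quantization from the BitNet b1.58 paper, sketched below for illustration; this is not this repo's actual integration code:

```python
# Illustrative BitNet b1.58-style quantization: weights -> {-1, 0, +1}
# with a per-tensor "absmean" scale. Not the repo's actual code.
import torch

def absmean_ternary(w: torch.Tensor, eps: float = 1e-8):
    gamma = w.abs().mean().clamp(min=eps)   # per-tensor scale
    w_q = (w / gamma).round().clamp(-1, 1)  # ternary weights
    return w_q, gamma                       # dequantize as w_q * gamma

w_q, gamma = absmean_ternary(torch.randn(4, 4))
print(w_q, gamma)
```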