Mozilla/TriLM-llamafile


jartine committed on Aug 25

Commit 19d64c4 • 1 Parent(s): ddfe71b

Update README.md


Files changed (1)

README.md +151 -0

README.md ADDED

	@@ -0,0 +1,151 @@

+---
+language:
+- en
+model_creator: SpectraSuite
+quantized_by: jartine
+pipeline_tag: text-generation
+license: apache-2.0
+license_link: LICENSE
+tags:
+- llamafile
+---
+# TriLM - llamafile
+This is a ternary LLM: each of its weights is one of {-1, 0, +1}. It's
+highly optimized for CPU performance, thanks to the [`Q2_K_S` quantization
+format](https://github.com/Mozilla-Ocho/llamafile/pull/552).
+- Model creator: [SpectraSuite](https://huggingface.co/SpectraSuite)
+- Original model: [TriLMs-Unpacked](https://huggingface.co/collections/SpectraSuite/trilms-unpacked-668d5f62afe0f4036925b1d2)
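
To make the ternary idea concrete, here is a minimal Python sketch of why weights restricted to {-1, 0, +1} are cheap to evaluate on a CPU: a dot product needs no multiplications, only additions and subtractions. This is an illustration only. It is not the `Q2_K_S` kernel and not code from llamafile, and the function name `ternary_dot` is hypothetical.

```python
def ternary_dot(weights, activations):
    """Dot product where every weight is -1, 0, or +1.

    Illustrative only: real kernels operate on packed, quantized blocks.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    total = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            total += x  # +1 weight: add the activation
        elif w == -1:
            total -= x  # -1 weight: subtract the activation
        elif w != 0:
            raise ValueError(f"non-ternary weight: {w!r}")
        # 0 weight: the activation is skipped entirely
    return total


weights = [1, 0, -1, 1]
activations = [0.5, 2.0, 1.5, 3.0]
print(ternary_dot(weights, activations))  # 0.5 - 1.5 + 3.0 = 2.0
```
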
+
+This repository packages and distributes TriLM as executable weights,
+which we call [llamafiles](https://github.com/Mozilla-Ocho/llamafile).
+The files you download here will run on Linux, macOS, Windows, FreeBSD,
+OpenBSD, and NetBSD for AMD64 and ARM64.
+## Quickstart
+Running the following on a desktop OS will launch a tab in your web
+browser with a completions interface.
+```
+wget https://huggingface.co/Mozilla/TriLM-llamafile/resolve/main/TriLM_3.9B.llamafile
+chmod +x TriLM_3.9B.llamafile
+./TriLM_3.9B.llamafile
+```
+For further information, please see the [llamafile
+README](https://github.com/mozilla-ocho/llamafile/).
+**Having trouble?** See the ["Gotchas"
+section](https://github.com/mozilla-ocho/llamafile/?tab=readme-ov-file#gotchas-and-troubleshooting)
+of the README.
+## Prompting
+This is a base model that hasn't been fine-tuned for chat, so the
+completions interface is the recommended way to use it.
+With the smaller TriLM models (e.g. 99M), set a high repeat penalty,
+e.g. `--repeat-penalty 10`.
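
As a rough sketch of the advice above, the following Python snippet builds a command line that launches one of the smaller models with a high repeat penalty. Only the `--repeat-penalty` flag comes from this README. The local path `./TriLM_99M.llamafile` and the helper name `build_command` are assumptions for illustration, and the snippet only prints the command rather than running it.

```python
import shlex


def build_command(llamafile_path, repeat_penalty=10.0):
    """Return an argv list that launches a llamafile with a repeat penalty.

    Assumes the llamafile was already downloaded and made executable.
    """
    if repeat_penalty <= 0:
        raise ValueError("repeat_penalty must be positive")
    # The :g format drops a trailing ".0", so 10.0 is rendered as "10".
    return [llamafile_path, "--repeat-penalty", f"{repeat_penalty:g}"]


command = build_command("./TriLM_99M.llamafile")
print(shlex.join(command))
```

To actually start the model, the resulting list could be passed to `subprocess.run(command)`.
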
+## Benchmarks
+| cpu\_info                                      | model\_filename                          | size       | test          | t/s             |
+| -----------------------------------------:     | ---------------------------------------: | ---------: | ------------: | --------------: |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_3.9B.llamafile                    | 1.31 GiB   | pp512         | 1069.54         |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_3.9B.llamafile                    | 1.31 GiB   | tg16          | 88.47           |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_2.4B.llamafile                    | 837.02 MiB | pp512         | 1441.04         |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_2.4B.llamafile                    | 837.02 MiB | tg16          | 110.80          |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_1.5B.llamafile                    | 531.44 MiB | pp512         | 2185.94         |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_1.5B.llamafile                    | 531.44 MiB | tg16          | 154.59          |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_1.1B.llamafile                    | 408.66 MiB | pp512         | 2692.87         |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_1.1B.llamafile                    | 408.66 MiB | tg16          | 173.08          |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_830M.llamafile                    | 301.76 MiB | pp512         | 3353.51         |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_830M.llamafile                    | 301.76 MiB | tg16          | 191.98          |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_560M.llamafile                    | 211.21 MiB | pp512         | 4297.08         |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_560M.llamafile                    | 211.21 MiB | tg16          | 209.57          |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_390M.llamafile                    | 148.93 MiB | pp512         | 5130.90         |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_390M.llamafile                    | 148.93 MiB | tg16          | 221.88          |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_99M.llamafile                     | 148.93 MiB | pp512         | 5127.00         |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_99M.llamafile                     | 148.93 MiB | tg16          | 218.93          |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_190M.llamafile                    | 78.55 MiB  | pp512         | 10874.11        |
+| AMD Ryzen Threadripper PRO 7995WX (znver4)     | TriLM\_190M.llamafile                    | 78.55 MiB  | tg16          | 334.45          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_3.9B.llamafile                    | 1.31 GiB   | pp512         | 227.95          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_3.9B.llamafile                    | 1.31 GiB   | tg16          | 65.17           |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_2.4B.llamafile                    | 837.02 MiB | pp512         | 347.93          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_2.4B.llamafile                    | 837.02 MiB | tg16          | 48.26           |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_1.5B.llamafile                    | 531.44 MiB | pp512         | 588.86          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_1.5B.llamafile                    | 531.44 MiB | tg16          | 140.22          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_1.1B.llamafile                    | 408.66 MiB | pp512         | 767.47          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_1.1B.llamafile                    | 408.66 MiB | tg16          | 167.80          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_830M.llamafile                    | 301.76 MiB | pp512         | 1031.20         |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_830M.llamafile                    | 301.76 MiB | tg16          | 204.46          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_560M.llamafile                    | 211.21 MiB | pp512         | 1487.29         |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_560M.llamafile                    | 211.21 MiB | tg16          | 245.53          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_390M.llamafile                    | 148.93 MiB | pp512         | 2049.02         |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_390M.llamafile                    | 148.93 MiB | tg16          | 332.24          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_99M.llamafile                     | 148.93 MiB | pp512         | 2103.34         |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_99M.llamafile                     | 148.93 MiB | tg16          | 301.31          |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_190M.llamafile                    | 78.55 MiB  | pp512         | 4762.49         |
+| Apple M2 Ultra (+fp16+dotprod)                 | TriLM\_190M.llamafile                    | 78.55 MiB  | tg16          | 553.83          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_3.9B.llamafile                    | 1.31 GiB   | pp512         | 167.15          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_3.9B.llamafile                    | 1.31 GiB   | tg16          | 53.22           |
+| Intel Core i9-14900K (alderlake)               | TriLM\_2.4B.llamafile                    | 837.02 MiB | pp512         | 261.73          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_2.4B.llamafile                    | 837.02 MiB | tg16          | 78.39           |
+| Intel Core i9-14900K (alderlake)               | TriLM\_1.5B.llamafile                    | 531.44 MiB | pp512         | 426.17          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_1.5B.llamafile                    | 531.44 MiB | tg16          | 123.91          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_1.1B.llamafile                    | 408.66 MiB | pp512         | 563.58          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_1.1B.llamafile                    | 408.66 MiB | tg16          | 159.13          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_830M.llamafile                    | 301.76 MiB | pp512         | 763.27          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_830M.llamafile                    | 301.76 MiB | tg16          | 209.42          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_560M.llamafile                    | 211.21 MiB | pp512         | 1116.30         |
+| Intel Core i9-14900K (alderlake)               | TriLM\_560M.llamafile                    | 211.21 MiB | tg16          | 295.71          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_390M.llamafile                    | 148.93 MiB | pp512         | 1586.69         |
+| Intel Core i9-14900K (alderlake)               | TriLM\_390M.llamafile                    | 148.93 MiB | tg16          | 377.50          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_99M.llamafile                     | 148.93 MiB | pp512         | 1587.38         |
+| Intel Core i9-14900K (alderlake)               | TriLM\_99M.llamafile                     | 148.93 MiB | tg16          | 401.37          |
+| Intel Core i9-14900K (alderlake)               | TriLM\_190M.llamafile                    | 78.55 MiB  | pp512         | 3713.16         |
+| Intel Core i9-14900K (alderlake)               | TriLM\_190M.llamafile                    | 78.55 MiB  | tg16          | 845.54          |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_3.9B.llamafile                    | 1.31 GiB   | pp512         | 17.02           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_3.9B.llamafile                    | 1.31 GiB   | tg16          | 6.67            |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_2.4B.llamafile                    | 837.02 MiB | pp512         | 26.35           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_2.4B.llamafile                    | 837.02 MiB | tg16          | 10.52           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_1.5B.llamafile                    | 531.44 MiB | pp512         | 42.52           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_1.5B.llamafile                    | 531.44 MiB | tg16          | 16.91           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_1.1B.llamafile                    | 408.66 MiB | pp512         | 56.57           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_1.1B.llamafile                    | 408.66 MiB | tg16          | 20.54           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_390M.llamafile                    | 148.93 MiB | pp512         | 146.67          |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_390M.llamafile                    | 148.93 MiB | tg16          | 56.77           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_99M.llamafile                     | 148.93 MiB | pp512         | 147.65          |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_99M.llamafile                     | 148.93 MiB | tg16          | 58.24           |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_190M.llamafile                    | 78.55 MiB  | pp512         | 338.42          |
+| Raspberry Pi 5 Model B Rev 1.0 (+fp16+dotprod) | TriLM\_190M.llamafile                    | 78.55 MiB  | tg16          | 107.33          |
+## About llamafile
+llamafile is a new format introduced by Mozilla Ocho on Nov 20th 2023.
+It uses Cosmopolitan Libc to turn LLM weights into runnable llama.cpp
+binaries that run on the stock installs of six OSes for both ARM64 and
+AMD64.
+
+---
+# TriLM 3.9B Unpacked
+TriLM (ternary model), unpacked to FP16 format and compatible with FP16 GEMMs. After unpacking, TriLM has the same architecture as LLaMA.
+```python
+import torch
+import transformers as tf
+model_name = "SpectraSuite/TriLM_3.9B_Unpacked"
+# Adjust the temperature, repetition penalty, top_k, top_p and other sampling parameters according to your needs.
+pipeline = tf.pipeline("text-generation", model=model_name, model_kwargs={"torch_dtype": torch.float16}, device_map="auto")
+# These are base (pretrained) LLMs that are not instruction or chat tuned. You may need to adjust your prompt accordingly.
+pipeline("Once upon a time")
+```
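
The comment in the snippet above mentions tuning sampling parameters. Here is a hedged sketch of how that might look. The keyword names (`max_new_tokens`, `do_sample`, `temperature`, `top_k`, `top_p`, `repetition_penalty`) are standard generation arguments in `transformers`, but the specific values and the helper `generate_text` are illustrative assumptions, not settings recommended by the model authors.

```python
# Illustrative sampling settings; the values are assumptions, not tuned defaults.
SAMPLING_KWARGS = {
    "max_new_tokens": 64,
    "do_sample": True,
    "temperature": 0.7,
    "top_k": 50,
    "top_p": 0.95,
    "repetition_penalty": 1.2,
}


def generate_text(pipe, prompt, **overrides):
    """Call a text-generation pipeline with the settings above.

    `pipe` is any callable with the transformers pipeline interface,
    e.g. the `pipeline` object created in the previous snippet.
    """
    kwargs = {**SAMPLING_KWARGS, **overrides}
    outputs = pipe(prompt, **kwargs)
    # A text-generation pipeline returns a list of dicts with "generated_text".
    return outputs[0]["generated_text"]
```

With the pipeline from the previous snippet, a call might look like `generate_text(pipeline, "Once upon a time", temperature=0.9)`.
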
+* License: Apache 2.0
+* We will use our GitHub repo for communication (including queries about the HF repo). Feel free to open an issue at https://github.com/NolanoOrg/SpectraSuite