Update README.md
README.md CHANGED

@@ -1,13 +1,4 @@
 ---
-language:
-- en
-- de
-- fr
-- it
-- pt
-- hi
-- es
-- th
 tags:
 - llamafile
 - facebook

@@ -74,7 +65,10 @@ This model has a max context window size of 128k tokens. By default, a
 context window size of 4096 tokens is used. You can use a larger context
 window by passing the `-c 8192` flag. The software currently has
 limitations in its llama v3.1 support that may prevent scaling to the
-full 128k size.
+full 128k size. See our
+[Phi-3-medium-128k-instruct-llamafile](https://huggingface.co/Mozilla/Phi-3-medium-128k-instruct-llamafile)
+repository for llamafiles that are known to work with a 128kb context
+size.
 
 On GPUs with sufficient RAM, the `-ngl 999` flag may be passed to use
 the system's NVIDIA or AMD GPU(s). On Windows, only the graphics card
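The `-c` and `-ngl` flags from the README text above compose into a single command line. A minimal sketch of that invocation, built as a string so the pieces are visible; the llamafile name here is an assumption for illustration, not taken from this commit:

```shell
# Hypothetical model filename -- substitute the llamafile you downloaded.
MODEL=./Meta-Llama-3.1-8B-Instruct.llamafile
CTX=8192   # -c: context window in tokens (default is 4096)
NGL=999    # -ngl: offload layers to the system's NVIDIA or AMD GPU(s)
CMD="$MODEL -c $CTX -ngl $NGL"
echo "$CMD"
```

Running the resulting command starts the llamafile with an 8192-token context window and full GPU offload, per the flags described in the README.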