jartine committed
Commit
d003a5a
1 Parent(s): f3d52af

Update README.md

Files changed (1)
  1. README.md +4 -10
README.md CHANGED
@@ -1,13 +1,4 @@
 ---
-language:
-- en
-- de
-- fr
-- it
-- pt
-- hi
-- es
-- th
 tags:
 - llamafile
 - facebook
@@ -74,7 +65,10 @@ This model has a max context window size of 128k tokens. By default, a
 context window size of 4096 tokens is used. You can use a larger context
 window by passing the `-c 8192` flag. The software currently has
 limitations in its llama v3.1 support that may prevent scaling to the
-full 128k size.
+full 128k size. See our
+[Phi-3-medium-128k-instruct-llamafile](https://huggingface.co/Mozilla/Phi-3-medium-128k-instruct-llamafile)
+repository for llamafiles that are known to work with a 128k context
+size.
 
 On GPUs with sufficient RAM, the `-ngl 999` flag may be passed to use
 the system's NVIDIA or AMD GPU(s). On Windows, only the graphics card
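The README text in the hunk above documents two runtime flags: `-c` for the context window size and `-ngl` for GPU layer offload. As a minimal usage sketch, assuming a downloaded llamafile saved under the placeholder name `model.llamafile` (not a real file in this repository), an invocation combining both flags might look like:

```shell
#!/bin/sh
# Sketch of launching a llamafile with the flags described above.
# "model.llamafile" is a placeholder name; substitute your downloaded file.
MODEL=./model.llamafile
if [ -x "$MODEL" ]; then
  # -c 8192: request an 8192-token context window instead of the 4096 default
  # -ngl 999: offload all model layers to the system's NVIDIA or AMD GPU
  "$MODEL" -c 8192 -ngl 999 -p "Hello"
else
  echo "model.llamafile not present; this is only a usage sketch"
fi
```

On Unix-like systems the downloaded file must first be made executable with `chmod +x model.llamafile` before it can be run this way.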