jartine committed on
Commit
9d3fb81
1 Parent(s): eb8b832

Update README.md

Files changed (1)
  1. README.md +10 -0
README.md CHANGED
@@ -70,6 +70,16 @@ full 128k size. See our
 repository for llamafiles that are known to work with a 128kb context
 size.
 
+On Windows there's a 4GB limit on executable sizes. You can work around
+that by downloading the [official llamafile
+release](https://github.com/Mozilla-Ocho/llamafile/releases) binary,
+renaming it to have a .exe extension, and then passing the llamafiles in
+this repo via the `-m` flag as though they were GGUF weights, e.g.
+
+```
+.\llamafile-0.8.11.exe -m Meta-Llama-3.1-405B.Q2_K.llamafile
+```
+
 On GPUs with sufficient RAM, the `-ngl 999` flag may be passed to use
 the system's NVIDIA or AMD GPU(s). On Windows, only the graphics card
 driver needs to be installed. If the prebuilt DSOs should fail, the CUDA
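
Note on the added paragraph: the workaround it describes can be sketched as a small launcher helper. This is a rough illustration only, not part of the commit. The function names `needs_workaround` and `launch_command` are hypothetical, the 4GB limit is assumed to mean 2**32 bytes with files exactly at the limit still allowed, and the direct-run branch assumes the smaller llamafile has already been given a .exe name.

```python
# Assumption: the README's "4GB limit" is interpreted as 4 GiB (2**32 bytes).
WINDOWS_EXE_LIMIT = 4 * 1024**3


def needs_workaround(size_bytes: int) -> bool:
    """Return True when a llamafile is too large for Windows to execute directly."""
    return size_bytes > WINDOWS_EXE_LIMIT


def launch_command(runner: str, llamafile: str, size_bytes: int) -> list[str]:
    """Build the argument list for starting a llamafile on Windows.

    runner     -- path to the official release binary, already renamed to .exe
    llamafile  -- path to a llamafile from this repo
    size_bytes -- size of that llamafile on disk
    """
    if needs_workaround(size_bytes):
        # Too big to run as an executable: hand it to the release binary
        # via -m, as though it were GGUF weights.
        return [runner, "-m", llamafile]
    # Small enough to run on its own (assumes it already has a .exe name).
    return [llamafile]


if __name__ == "__main__":
    # The size here is a made-up illustrative value, not the real file size.
    print(launch_command(r".\llamafile-0.8.11.exe",
                         "Meta-Llama-3.1-405B.Q2_K.llamafile",
                         size_bytes=100 * 1024**3))
```

The returned list can be passed to `subprocess.run` unchanged. In a real script the size would come from `os.path.getsize` rather than being passed in by hand.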