jartine committed
Commit f249b3d · verified · 1 Parent(s): 44a894d

Update README.md

Files changed (1)
  1. README.md +7 -0
README.md CHANGED
@@ -116,6 +116,13 @@ driver needs to be installed if you own an NVIDIA GPU. On Windows, if
 you have an AMD GPU, you should install the ROCm SDK v6.1 and then pass
 the flags `--recompile --gpu amd` the first time you run your llamafile.
 
+On NVIDIA GPUs, by default, the prebuilt tinyBLAS library is used to
+perform matrix multiplications. This is open source software, but it
+doesn't go as fast as closed source cuBLAS. If you have the CUDA SDK
+installed on your system, then you can pass the `--recompile` flag to
+build a GGML CUDA library just for your system that uses cuBLAS. This
+ensures you get maximum performance.
+
 For further information, please see the [llamafile
 README](https://github.com/mozilla-ocho/llamafile/).
 
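
The paragraph added by this commit describes when to pass `--recompile` on NVIDIA hardware. The following is a minimal sketch of that decision, not llamafile's own detection logic. It assumes that finding `nvcc` on `PATH` is a reasonable sign that the CUDA SDK is installed; the file name `./model.llamafile` and the helper `recompile_flag` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical file name; substitute the llamafile you actually downloaded.
LLAMAFILE=./model.llamafile

# Print "--recompile" if the given tool (default: nvcc) is on PATH.
# Assumption: nvcc being present means the CUDA SDK is installed.
# Prints nothing otherwise, so the prebuilt tinyBLAS library is used.
recompile_flag() {
  if command -v "${1:-nvcc}" >/dev/null 2>&1; then
    printf '%s\n' "--recompile"
  fi
}

# Show the command instead of running it, so the sketch is safe to try.
echo "Would run: $LLAMAFILE $(recompile_flag)"
```

To launch the llamafile for real, replace the final `echo` line with `"$LLAMAFILE" $(recompile_flag)`. Without the CUDA SDK, the function prints nothing and the llamafile starts with its default tinyBLAS path.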