TheBloke committed on
Commit
bd282c1
1 Parent(s): b58cc00

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -54,9 +54,9 @@ Once compiled you can then use `bin/falcon_main` just like you would use llama.c
  bin/falcon_main -t 8 -ngl 100 -b 1 -m falcon40b-instruct.ggmlv3.q3_K_S.bin -p "What is a falcon?\n### Response:"
  ```
 
- You can specify `-ngl 100` regardles of your VRAM, as it will automatically detect how much VRAM is available can be used.
+ You can specify `-ngl 100` regardles of your VRAM, as it will automatically detect how much VRAM is available to be used.
 
- Adjust `-t 8` according to what performs best on your system. Do not exceed the number of physical CPU cores you have.
+ Adjust `-t 8` (the number of CPU cores to use) according to what performs best on your system. Do not exceed the number of physical CPU cores you have.
 
  `-b 1` reduces batch size to 1. This slightly lowers prompt evaluation time, but frees up VRAM to load more of the model on to your GPU. If you find prompt evaluation too slow and have enough spare VRAM, you can remove this parameter.