Update constants.py
constants.py (+2 −2)
@@ -33,8 +33,8 @@ MAX_NEW_TOKENS = CONTEXT_WINDOW_SIZE # int(CONTEXT_WINDOW_SIZE/4)
 
 #### If you get a "not enough space in the buffer" error, you should reduce the values below, start with half of the original values and keep halving the value until the error stops appearing
 
-N_GPU_LAYERS =
-N_BATCH =
+N_GPU_LAYERS = 8 # Llama-2-70B has 83 layers
+N_BATCH = 16
 
 ### From experimenting with the Llama-2-7B-Chat-GGML model on 8GB VRAM, these values work:
 # N_GPU_LAYERS = 20