Update README.md
README.md
CHANGED
@@ -148,7 +148,7 @@ We will automatically find a batch size that fits in your GPU memory. The defaul
 
 ### Loading Huge Models
 
-Huge models such as LLaMA 65B or nllb-moe-54b can be
+Huge models such as LLaMA 65B or nllb-moe-54b can be loaded in a single GPU with 8 bits and 4 bits quantification with minimal performance degradation.
 See [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes). Set precision to 8 or 4 with the `--precision` flag.
 
 ```bash