daedalus314 committed
Commit 3a2c5e7
Parent(s): 262f2c3
Update README.md

README.md CHANGED
@@ -16,7 +16,7 @@ This model is a quantized version of [Marx-3B-V2](https://huggingface.co/acrastt
 # Usage
 
 The model has been quantized as part of the project [GPTStonks](https://github.com/GPTStonks). It works with `transformers>=4.33.0` and it can run on a consumer GPU, with less than 3GB of GPU RAM. The libraries `optimum`, `auto-gptq`, `peft` and `accelerate` should also be installed.
 
-Here is a sample code to load the model and run inference with it using
+Here is a sample code to load the model and run inference with it using greedy decoding:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
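The changed line names greedy decoding, i.e. selecting the highest-scoring token at every generation step with no sampling (what `do_sample=False` requests in `transformers`' `model.generate`). As a minimal sketch of that selection rule alone, with hypothetical logit values (the function name and numbers are illustrative, not part of the model's API):

```python
def greedy_next_token(logits):
    """Greedy decoding step: return the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical logits over a 4-token vocabulary
logits = [0.1, 2.5, -1.0, 0.3]
print(greedy_next_token(logits))  # index 1 holds the largest logit
```

In practice the quantized model applies this rule internally at each step of `generate` when sampling is disabled; the sketch only isolates the per-step choice.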