Update README.md
carefully read the terms of usage and click the accept button. You will need to use an access token for the code below to run. For more information on access tokens, refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).

```bash
pip install optimum-intel[openvino]

optimum-cli export openvino --model meta-llama/Meta-Llama-3.1-8B-Instruct --task text-generation-with-past --weight-format int8 main_model_path
```

3. Download the draft model from the Hugging Face Hub:

```python
import huggingface_hub as hf_hub

draft_model_id = "OpenVINO/Llama-3.1-8B-Instruct-FastDraft-150M"
draft_model_path = "draft"

hf_hub.snapshot_download(draft_model_id, local_dir=draft_model_path)
```

4. Run model inference using speculative decoding and specify the pipeline parameters:

```python
import openvino_genai

prompt = "What is OpenVINO?"
```
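For intuition about what the speculative-decoding pipeline is doing: the small draft model cheaply proposes several tokens, and the main (target) model verifies them in a single pass, keeping only the prefix it agrees with. A toy pure-Python sketch of that accept/verify loop (greedy variant; `speculative_step`, `draft_propose`, and `target_next` are illustrative stand-ins, not part of the openvino_genai API):

```python
# Toy illustration of greedy speculative decoding. All names here are
# made up for illustration; this is not the openvino_genai API.

def speculative_step(context, draft_propose, target_next, k=5):
    """Propose k draft tokens, keep the prefix the target agrees with,
    then append one token from the target itself."""
    draft = draft_propose(context, k)              # k cheap guesses
    accepted = []
    for tok in draft:
        expected = target_next(context + accepted)
        if tok == expected:                        # target agrees: keep it
            accepted.append(tok)
        else:                                      # first mismatch: stop
            break
    # The target always contributes the token after the accepted prefix,
    # so every step advances by at least one token.
    accepted.append(target_next(context + accepted))
    return accepted

# Tiny demo: the target deterministically continues an arithmetic sequence;
# the draft gets the first two continuations right, then guesses wrong.
target_next = lambda ctx: ctx[-1] + 1
draft_propose = lambda ctx, k: [ctx[-1] + 1, ctx[-1] + 2, 99, 100, 101][:k]

out = speculative_step([0, 1, 2], draft_propose, target_next, k=5)
print(out)  # [3, 4, 5]: two draft tokens accepted, plus one target token
```

When the draft model predicts well (as FastDraft is trained to do for its target model), most proposed tokens are accepted, so the expensive main model runs far fewer forward passes per generated token.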