Create README.md
README.md
```bash
git clone https://github.com/NVIDIA/TensorRT-LLM.git

python ./TensorRT-LLM/examples/run.py --engine_dir=llama3engine_bf16_1gpu \
    --max_output_len 5 \
    --tokenizer_dir llama3-hf \
    --input_text "How do I count to nine in French?" \
    --run_profiling

2024-04-25 19:35:59.062455: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Input [Text 0]: "<|begin_of_text|>How do I count to nine in French?"
Output [Text 0 Beam 0]: " Counting in French is"
batch_size: 1, avg latency of 10 iterations: : 0.0999948501586914 sec
```
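
The reported average latency can be turned into a rough throughput figure. The sketch below is only an estimate: it assumes each profiled iteration generated all 5 requested output tokens and that the reported latency covers the whole generation call, prompt processing included. The variable names are illustrative and are not part of `run.py`.

```bash
# Values copied from the run above.
max_output_len=5
avg_latency_sec=0.0999948501586914

# Assumption: every iteration produced max_output_len tokens, so
# throughput is approximately tokens / latency.
tokens_per_sec=$(awk -v n="$max_output_len" -v t="$avg_latency_sec" \
    'BEGIN { printf "%.1f", n / t }')

echo "approx. ${tokens_per_sec} output tokens/sec"
```

With an output this short, prompt processing and fixed per-call overhead make up a large share of the time, so a longer `--max_output_len` would likely give a more representative number.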