Backend: transformers
For interactive visualization of the results, save the linked file as HTML on your machine and open it in a browser.
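Once the file is saved, it can also be opened from a script instead of by hand. The following is a minimal sketch using only the Python standard library; the file name `results.html` and the helper `local_html_uri` are placeholders for illustration, not names used by the benchmark itself.

```python
import webbrowser
from pathlib import Path


def local_html_uri(path: str) -> str:
    """Return a file:// URI for a locally saved HTML file."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(path).expanduser().resolve().as_uri()


if __name__ == "__main__":
    # "results.html" is a hypothetical name: use wherever you saved the linked file.
    webbrowser.open(local_html_uri("results.html"))
```

Passing a `file://` URI rather than a bare relative path avoids depending on how a given browser interprets the current working directory.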
Model: h2oai/h2ogpt-4096-llama2-7b-chat (transformers)
Number of GPUs: 0

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | CPU | 1215.52 | 1.17546 | |
| 8 | CPU | 1216.98 | 1.17641 | |
| 4 | CPU | 1217.17 | 1.16575 | |

Number of GPUs: 1

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 31.8619 | 41.9433 | |
| 16 | 1 x NVIDIA GeForce RTX 4090 (24564 MiB) | 32.2947 | 40.9252 | |
| 16 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | 37.1139 | 32.4529 | |
| 16 | 1 x NVIDIA RTX A6000 (46068 MiB) | 47.0375 | 29.8526 | |
| 16 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | 67.9752 | 18.0571 | |
| 8 | 1 x NVIDIA GeForce RTX 4090 (24564 MiB) | 114.622 | 9.21246 | |
| 8 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 94.1774 | 8.95532 | |
| 8 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | 181.246 | 7.47991 | |
| 8 | 1 x NVIDIA RTX A6000 (46068 MiB) | 148.616 | 6.61984 | |
| 8 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | 185.146 | 4.35807 | |
| 4 | 1 x NVIDIA GeForce RTX 4090 (24564 MiB) | 39.544 | 32.571 | |
| 4 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 42.8067 | 32.3408 | |
| 4 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | 53.3973 | 23.3267 | |
| 4 | 1 x NVIDIA RTX A6000 (46068 MiB) | 61.5241 | 22.8456 | |
| 4 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | 90.5194 | 14.9456 | |

Number of GPUs: 2

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 32.1395 | 40.3871 | |
| 16 | 2 x NVIDIA A100-SXM4-80GB (81920 MiB) | 39.9269 | 32.248 | |
| 16 | 2 x NVIDIA RTX A6000 (46068 MiB) | 47.4105 | 28.8472 | |
| 16 | 2 x NVIDIA GeForce RTX 3090 (24576 MiB) | 71.4808 | 17.7518 | |
| 8 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 94.9813 | 9.03765 | |
| 8 | 2 x NVIDIA A100-SXM4-80GB (81920 MiB) | 178.2 | 7.55443 | |
| 8 | 2 x NVIDIA RTX A6000 (46068 MiB) | 152.544 | 6.43862 | |
| 8 | 2 x NVIDIA GeForce RTX 3090 (24576 MiB) | 186.884 | 4.35012 | |
| 4 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 43.235 | 32.0566 | |
| 4 | 2 x NVIDIA A100-SXM4-80GB (81920 MiB) | 57.0808 | 22.6791 | |
| 4 | 2 x NVIDIA RTX A6000 (46068 MiB) | 64.6442 | 21.972 | |
| 4 | 2 x NVIDIA GeForce RTX 3090 (24576 MiB) | 94.5099 | 14.6162 | |

Number of GPUs: 4

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 42.3398 | 30.2181 | |
| 16 | 4 x NVIDIA RTX A6000 (46068 MiB) | 49.089 | 27.7344 | |
| 8 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 180.534 | 7.53804 | |
| 8 | 4 x NVIDIA RTX A6000 (46068 MiB) | 153.411 | 6.46469 | |
| 4 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 58.6287 | 21.9123 | |
| 4 | 4 x NVIDIA RTX A6000 (46068 MiB) | 66.4926 | 21.409 | |

Number of GPUs: 8

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 40.4986 | 30.5489 | |
| 8 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 186.713 | 7.23498 | |
| 4 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 60.1828 | 21.9172 | |

Model: h2oai/h2ogpt-4096-llama2-13b-chat (transformers)
Number of GPUs: 1

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 52.4984 | 26.2487 | |
| 16 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | 49.7972 | 24.9301 | |
| 16 | 1 x NVIDIA RTX A6000 (46068 MiB) | 71.9114 | 18.4362 | |
| 16 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | nan | nan | OOM |
| 16 | 1 x NVIDIA GeForce RTX 4090 (24564 MiB) | nan | nan | OOM |
| 8 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 168.967 | 7.67522 | |
| 8 | 1 x NVIDIA GeForce RTX 4090 (24564 MiB) | 185.442 | 6.0205 | |
| 8 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | 174.458 | 5.69269 | |
| 8 | 1 x NVIDIA RTX A6000 (46068 MiB) | 193.993 | 5.56359 | |
| 8 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | 280.467 | 3.75936 | |
| 4 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 45.3051 | 20.4771 | |
| 4 | 1 x NVIDIA GeForce RTX 4090 (24564 MiB) | 68.0646 | 16.1241 | |
| 4 | 1 x NVIDIA RTX A6000 (46068 MiB) | 81.1389 | 15.6933 | |
| 4 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | 74.271 | 15.0868 | |
| 4 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | 96.6189 | 9.77255 | |

Number of GPUs: 2

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 51.6428 | 26.1842 | |
| 16 | 2 x NVIDIA A100-SXM4-80GB (81920 MiB) | 51.299 | 24.8757 | |
| 16 | 2 x NVIDIA RTX A6000 (46068 MiB) | 72.8565 | 18.2039 | |
| 16 | 2 x NVIDIA GeForce RTX 3090 (24576 MiB) | 89.5996 | 12.8295 | |
| 8 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 167.523 | 7.82793 | |
| 8 | 2 x NVIDIA RTX A6000 (46068 MiB) | 195.929 | 5.51238 | |
| 8 | 2 x NVIDIA A100-SXM4-80GB (81920 MiB) | 180.781 | 5.43787 | |
| 8 | 2 x NVIDIA GeForce RTX 3090 (24576 MiB) | 280.831 | 3.72157 | |
| 4 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 47.1425 | 19.9791 | |
| 4 | 2 x NVIDIA RTX A6000 (46068 MiB) | 84.5776 | 15.1326 | |
| 4 | 2 x NVIDIA A100-SXM4-80GB (81920 MiB) | 79.9461 | 14.3455 | |
| 4 | 2 x NVIDIA GeForce RTX 3090 (24576 MiB) | 98.4705 | 9.68779 | |

Number of GPUs: 4

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 55.3779 | 21.7073 | |
| 16 | 4 x NVIDIA RTX A6000 (46068 MiB) | 74.4377 | 17.8537 | |
| 8 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 179.505 | 5.45185 | |
| 8 | 4 x NVIDIA RTX A6000 (46068 MiB) | 199.799 | 5.39725 | |
| 4 | 4 x NVIDIA RTX A6000 (46068 MiB) | 87.6579 | 14.6779 | |
| 4 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 78.9061 | 14.6754 | |

Number of GPUs: 8

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 55.3965 | 22.302 | |
| 8 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 185.328 | 5.38647 | |
| 4 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 83.0479 | 13.969 | |

Model: h2oai/h2ogpt-4096-llama2-70b-chat (transformers)
Number of GPUs: 1

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | nan | nan | OOM |
| 16 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | nan | nan | OOM |
| 16 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | nan | nan | OOM |
| 16 | 1 x NVIDIA RTX A6000 (46068 MiB) | nan | nan | OOM |
| 8 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | nan | nan | OOM |
| 8 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | nan | nan | OOM |
| 8 | 1 x NVIDIA RTX A6000 (46068 MiB) | nan | nan | OOM |
| 4 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 122.132 | 10.6495 | |
| 4 | 1 x NVIDIA RTX A6000 (46068 MiB) | 165.058 | 6.94248 | |
| 4 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | nan | nan | OOM |

Number of GPUs: 2

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 2 x NVIDIA RTX A6000 (46068 MiB) | nan | nan | OOM |
| 8 | 2 x NVIDIA RTX A6000 (46068 MiB) | 410.069 | 2.25687 | |
| 4 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 120.538 | 10.5008 | |
| 4 | 2 x NVIDIA RTX A6000 (46068 MiB) | 171.744 | 6.71342 | |
| 4 | 2 x NVIDIA GeForce RTX 3090 (24576 MiB) | nan | nan | OOM |

Number of GPUs: 4

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 4 x NVIDIA RTX A6000 (46068 MiB) | 267.056 | 4.24242 | |
| 8 | 4 x NVIDIA RTX A6000 (46068 MiB) | 413.957 | 2.22551 | |
| 4 | 4 x NVIDIA RTX A6000 (46068 MiB) | 175.491 | 6.5798 | |

Backend: text-generation-inference
For interactive visualization of the results, save the linked file as HTML on your machine and open it in a browser.
Model: h2oai/h2ogpt-4096-llama2-7b-chat (text-generation-inference)
Number of GPUs: 1

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 39.0155 | 55.2139 | |
| 16 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | 29.129 | 45.9535 | |
| 16 | 1 x NVIDIA GeForce RTX 4090 (24564 MiB) | 24.3988 | 44.5878 | |
| 16 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | 39.2697 | 30.3068 | |
| 16 | 1 x NVIDIA RTX A6000 (46068 MiB) | 40.3622 | 29.9724 | |

Number of GPUs: 2

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 7.63612 | 71.7881 | |
| 16 | 2 x NVIDIA RTX A6000 (46068 MiB) | 41.0461 | 30.3726 | |
| 16 | 2 x NVIDIA A100-SXM4-80GB (81920 MiB) | 41.0245 | 29.36 | |

Number of GPUs: 4

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 4 x NVIDIA RTX A6000 (46068 MiB) | 42.8377 | 29.388 | |
| 16 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 41.0995 | 28.4403 | |

Number of GPUs: 8

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 42.8594 | 27.8644 | |

Model: h2oai/h2ogpt-4096-llama2-13b-chat (text-generation-inference)
Number of GPUs: 1

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 1 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 21.7823 | 33.7132 | |
| 16 | 1 x NVIDIA A100-SXM4-80GB (81920 MiB) | 51.8428 | 19.083 | |
| 16 | 1 x NVIDIA GeForce RTX 3090 (24576 MiB) | nan | nan | OOM |
| 16 | 1 x NVIDIA RTX A6000 (46068 MiB) | nan | nan | OOM |

Number of GPUs: 2

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 2 x NVIDIA RTX 6000 Ada Generation (49140 MiB) | 10.8242 | 57.8237 | |
| 16 | 2 x NVIDIA GeForce RTX 3090 (24576 MiB) | 42.2111 | 31.4247 | |
| 16 | 2 x NVIDIA A100-SXM4-80GB (81920 MiB) | 53.3837 | 22.223 | |
| 16 | 2 x NVIDIA RTX A6000 (46068 MiB) | 64.782 | 21.3549 | |

Number of GPUs: 4

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 52.7912 | 21.3862 | |
| 16 | 4 x NVIDIA RTX A6000 (46068 MiB) | 66.5247 | 20.777 | |

Number of GPUs: 8

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 56.3847 | 20.3764 | |

Model: h2oai/h2ogpt-4096-llama2-70b-chat (text-generation-inference)
Number of GPUs: 4

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 4 x NVIDIA A100-SXM4-80GB (81920 MiB) | 131.453 | 9.61851 | |
| 16 | 4 x NVIDIA RTX A6000 (46068 MiB) | nan | nan | OOM |

Number of GPUs: 8

| bits | gpus | summarization time [sec] | generation speed [tokens/sec] | exception |
|---|---|---|---|---|
| 16 | 8 x NVIDIA A100-SXM4-80GB (81920 MiB) | 133.53 | 9.53011 | |