⚡ WebGPU Benchmark Results (62.80x speedup)
#74, opened by a414166402
| Batch Size | WASM (int8) | WASM (fp16) | WASM (fp32) | WebGPU (fp16) | WebGPU (fp32) |
| --- | --- | --- | --- | --- | --- |
| 1 | 400.70 | 513.20 | 467.20 | 17.40 | 19.90 |
| 2 | 796.50 | 1013.90 | 917.50 | 58.00 | 36.00 |
| 4 | 1552.70 | 2013.80 | 1849.20 | 56.10 | 65.90 |
| 8 | 3176.20 | 4164.10 | 3827.00 | 168.20 | 116.70 |
| 16 | 6805.90 | 8905.30 | 8092.00 | 258.60 | 145.70 |
| 32 | 14272.10 | 18282.50 | 16072.30 | 477.00 | 291.10 |
- Model: Xenova/all-MiniLM-L6-v2
- Tests run: WASM (int8), WASM (fp16), WASM (fp32), WebGPU (fp16), WebGPU (fp32)
- Sequence length: 512
- Browser: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36
- GPU: vendor=nvidia, architecture=turing, device=, description=
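
For anyone wanting to check where the 62.80x in the title comes from, here is a minimal sketch that recomputes it from the table. It assumes the table values are execution times (lower is better; the post does not state units) and that the headline compares the slowest WASM run against the fastest WebGPU run at the largest batch size. That is only one reading that happens to reproduce the figure, not necessarily how the benchmark page computes it. The names `TIMINGS`, `speedup`, and `headline_speedup` are made up for this example.

```python
# Timings copied from the results table, keyed by batch size.
# Assumption: values are execution times, so lower is better.
COLUMNS = ["WASM (int8)", "WASM (fp16)", "WASM (fp32)", "WebGPU (fp16)", "WebGPU (fp32)"]
ROWS = {
    1: [400.70, 513.20, 467.20, 17.40, 19.90],
    2: [796.50, 1013.90, 917.50, 58.00, 36.00],
    4: [1552.70, 2013.80, 1849.20, 56.10, 65.90],
    8: [3176.20, 4164.10, 3827.00, 168.20, 116.70],
    16: [6805.90, 8905.30, 8092.00, 258.60, 145.70],
    32: [14272.10, 18282.50, 16072.30, 477.00, 291.10],
}
TIMINGS = {batch: dict(zip(COLUMNS, values)) for batch, values in ROWS.items()}


def speedup(row, baseline, candidate):
    """How many times faster `candidate` is than `baseline` in one table row."""
    return row[baseline] / row[candidate]


def headline_speedup(row):
    """Slowest WASM time divided by fastest WebGPU time (assumed headline definition)."""
    slowest_wasm = max(t for name, t in row.items() if name.startswith("WASM"))
    fastest_webgpu = min(t for name, t in row.items() if name.startswith("WebGPU"))
    return slowest_wasm / fastest_webgpu


if __name__ == "__main__":
    largest_batch = max(TIMINGS)
    print(f"batch {largest_batch}: {headline_speedup(TIMINGS[largest_batch]):.2f}x")  # 62.80x
    # Like-for-like comparison (same precision) at every batch size.
    for batch, row in TIMINGS.items():
        print(f"batch {batch}: fp32 WASM vs fp32 WebGPU = "
              f"{speedup(row, 'WASM (fp32)', 'WebGPU (fp32)'):.2f}x")
```

Under that reading, the headline number pairs WASM (fp16) with WebGPU (fp32) at batch size 32, so it mixes precisions. The same-precision loop at the end gives a more conservative comparison.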