how to run inference in fp8?
#38 by codewithRiz - opened
We have tested on an A100 as well as locally on an RTX 4070, serving the model through a Flask API. Right now a 1-minute video takes about 1.5 minutes of inference time on average. I already tried uploading the video to the server beforehand and sending just the video ID, so that upload time is not counted. I also tried torch.compile. I'm not sure how to reduce the inference time further.
Any tips?
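To make sure the 1.5 minutes is really spent in the model and not in video loading, timing each stage separately can help. Below is a minimal sketch using only the standard library. `load_video`, `run_model`, and `handle_request` are hypothetical stand-ins, not functions from this Space, and the sleeps only simulate work.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
timings = {}

@contextmanager
def timed(stage):
    """Add the elapsed wall-clock time of the wrapped block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + (time.perf_counter() - start)

def load_video(video_id):
    # Hypothetical stand-in for reading/decoding a video already on the server.
    time.sleep(0.01)
    return [video_id] * 4

def run_model(frames):
    # Hypothetical stand-in for the actual model call.
    # Note: CUDA kernels run asynchronously, so with a real PyTorch model on GPU
    # you would need to synchronize the device before the timer stops,
    # otherwise the measured time can be too short.
    time.sleep(0.02)
    return len(frames)

def handle_request(video_id):
    # Time each stage separately so the slow part is visible.
    with timed("load"):
        frames = load_video(video_id)
    with timed("inference"):
        result = run_model(frames)
    return result

if __name__ == "__main__":
    handle_request("example-id")
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds:.3f}s")
```

If most of the time ends up under "load", lower precision such as fp8 would probably not change much, and decoding or preprocessing would be the place to look first.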