Improving the inference/classification/prediction speed of this bart-large-mnli model
#15
by
abhijit57
- opened
Hello,
I am working on a text classification research project with a dataset of about 500,000 rows, where each document is fairly long (70-100 tokens). I tried this model on an NVIDIA V100 32 GB GPU with just 10 rows and 804 candidate labels, and it took 10 minutes. I cannot reduce the candidate label list, per the project requirements. I also tried the Codon compiler and Numba to improve inference speed, but without much luck.
Has anyone worked with a C++ BART implementation, or used DeepSpeed to speed up predictions for this model?
Any leads or help would be greatly appreciated, thank you.
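For context on why this is so slow: zero-shot classification with an NLI model like bart-large-mnli runs one entailment forward pass per (document, label) pair, so the cost scales linearly with the number of candidate labels. A minimal sketch of that expansion (the hypothesis template below is an assumption based on the default used by the transformers zero-shot pipeline; the label names are placeholders):

```python
# Sketch: how zero-shot NLI classification expands into forward passes.
# Assumed hypothesis template (transformers' default is similar);
# row/label counts are taken from the post above.

hypothesis_template = "This example is {}."

def build_nli_pairs(documents, candidate_labels):
    """One (premise, hypothesis) pair per document x label combination."""
    return [
        (doc, hypothesis_template.format(label))
        for doc in documents
        for label in candidate_labels
    ]

docs = ["sample document"] * 10               # 10 rows, as in the post
labels = [f"label_{i}" for i in range(804)]   # 804 candidate labels
pairs = build_nli_pairs(docs, labels)
print(len(pairs))  # 10 rows x 804 labels = 8040 forward passes
```

At 500,000 rows that is 402 million forward passes, which is why batching, fp16, and kernel-level optimizations matter far more here than compiler tricks like Codon or Numba (the bottleneck is the model, not the Python glue).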
+1, did you find a way?
Use DeepSpeed inference for this.
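A minimal sketch of what that could look like, assuming DeepSpeed and a CUDA GPU are available (the example premise/hypothesis and the fp16 choice are illustrative, not from this thread):

```python
# Hedged sketch: wrapping bart-large-mnli with DeepSpeed inference kernels.
# Requires a CUDA GPU and the deepspeed package; not runnable on CPU.
import torch
import deepspeed
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Inject fused inference kernels and run in fp16 to cut latency.
engine = deepspeed.init_inference(
    model,
    dtype=torch.half,
    replace_with_kernel_inject=True,
)

premise = "The new GPU drivers improved training throughput."
hypothesis = "This example is technology."  # one NLI pair, as in zero-shot
inputs = tokenizer(premise, hypothesis, return_tensors="pt").to("cuda")
with torch.no_grad():
    # bart-large-mnli logits order: [contradiction, neutral, entailment]
    logits = engine(**inputs).logits
entail_prob = logits.softmax(dim=-1)[0, 2].item()
```

Even with kernel injection, you still pay one forward pass per (document, label) pair, so batch the pairs as large as the 32 GB of V100 memory allows.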