Improving the inference/classification/prediction speed of this bart-large-mnli model
#15
by
abhijit57
- opened
Hello,
I am working on a text classification research project with a dataset of about 500,000 rows, where each document is fairly long (70-100 tokens). I tried this model on an NVIDIA V100 32 GB GPU with just 10 rows and 804 candidate labels, and it took 10 minutes. I cannot reduce the candidate label list, per the project requirements. I also tried the Codon compiler and Numba to improve inference speed, but without much luck.
Has anyone worked with a C++ BART implementation, or used DeepSpeed to speed up predictions for this model?
Any leads or help would be greatly appreciated, thank you.
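For context on why this is so slow: zero-shot classification with an NLI model like bart-large-mnli runs one entailment forward pass per (document, label) pair, so the cost scales linearly with the number of candidate labels. A minimal sketch of that expansion (the hypothesis template below is an assumption based on the default used by the transformers zero-shot pipeline; the label names are placeholders):

```python
# Sketch: how zero-shot NLI classification expands into forward passes.
# Assumed hypothesis template (transformers' default is similar);
# row/label counts are taken from the post above.

hypothesis_template = "This example is {}."

def build_nli_pairs(documents, candidate_labels):
    """One (premise, hypothesis) pair per document x label combination."""
    return [
        (doc, hypothesis_template.format(label))
        for doc in documents
        for label in candidate_labels
    ]

docs = ["sample document"] * 10               # 10 rows, as in the post
labels = [f"label_{i}" for i in range(804)]   # 804 candidate labels
pairs = build_nli_pairs(docs, labels)
print(len(pairs))  # 10 rows x 804 labels = 8040 forward passes
```

At 500,000 rows that is 402 million forward passes, which is why batching, fp16, and kernel-level optimizations matter far more here than compiler tricks like Codon or Numba (the bottleneck is the model, not the Python glue).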
+1, did you find a way?
Use DeepSpeed inference for this.
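A minimal sketch of what that could look like, assuming DeepSpeed and a CUDA GPU are available (the example premise/hypothesis and the fp16 choice are illustrative, not from this thread):

```python
# Hedged sketch: wrapping bart-large-mnli with DeepSpeed inference kernels.
# Requires a CUDA GPU and the deepspeed package; not runnable on CPU.
import torch
import deepspeed
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Inject fused inference kernels and run in fp16 to cut latency.
engine = deepspeed.init_inference(
    model,
    dtype=torch.half,
    replace_with_kernel_inject=True,
)

premise = "The new GPU drivers improved training throughput."
hypothesis = "This example is technology."  # one NLI pair, as in zero-shot
inputs = tokenizer(premise, hypothesis, return_tensors="pt").to("cuda")
with torch.no_grad():
    # bart-large-mnli logits order: [contradiction, neutral, entailment]
    logits = engine(**inputs).logits
entail_prob = logits.softmax(dim=-1)[0, 2].item()
```

Even with kernel injection, you still pay one forward pass per (document, label) pair, so batch the pairs as large as the 32 GB of V100 memory allows.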