What is the minimum Space Hardware to run this (cloned) Space?

#10
by KHCHEUNG-UoSHK - opened

This Space runs on ZeroGPU, which allows Spaces to run on multiple GPUs. I requested ZeroGPU access a week ago, but it has not been granted yet, so I would like to know the minimum Space Hardware required to run a cloned copy of this Space.
I do not mind paying to run it, but I would rather not waste time and money testing hardware tiers one by one to find the minimum.

Hey @merve, just curious, do you know the minimum GPU RAM needed for inference here? 🤔
cc: @nielsr

Owner
•
edited May 9

Hello 👋 I think it's roughly a 7B LM plus, I think, less than 1B params for the vision tower and projector, running in float16, which you should be able to run on a V100 easily. If you want to reduce memory use during inference (not storage), you could load the model in 8-bit or 4-bit on a T4. @KHCHEUNG-UoSHK @fcakyon sorry for the late response
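As a rough back-of-the-envelope check of the estimate above, the sketch below computes the memory taken by the weights alone at different precisions. It assumes an upper bound of 8B total parameters (7B LM plus under 1B for the vision tower and projector); the names `PARAMS` and `weight_memory_gib` are illustrative and not taken from the Space's code. Activations, the KV cache, and framework overhead are not counted, so real usage will be higher.

```python
GIB = 1024 ** 3  # bytes per GiB

# Assumption: upper bound on total parameters (7B LM + <1B vision tower and projector).
PARAMS = 8e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights only, in GiB (no activations or KV cache)."""
    return num_params * bytes_per_param / GIB


estimates = {
    name: weight_memory_gib(PARAMS, size) for name, size in BYTES_PER_PARAM.items()
}

for name, gib in estimates.items():
    print(f"{name:>8}: {gib:.1f} GiB of weights")
```

Under this assumption the float16 weights come to about 14.9 GiB, which leaves little headroom on a 16 GB card, while 8-bit and 4-bit need roughly half and a quarter of that. Treat these numbers as a starting point for picking a hardware tier, not as a measured requirement.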

Owner

Also note that 4-bit and 8-bit aren't native to NVIDIA hardware, so under the hood the weights are cast back and forth to bf16 or float32, which results in slightly higher latency, but it makes it easier to work with a T4

Thank you for your response and for providing this fantastic space!
