How long did it take you to pre-train, and how much compute did it cost?
#1 · opened by damerajee
Great model! I've been learning about VLMs and multimodal models, and your code and GitHub repo really helped me. Before I start pre-training myself, I wanted to know: how long did it take, and what was the compute cost?
Using the LLaVA-Pretrain-JA data, pre-training completed in about 1 day on a single RTX 4090.
The LLM is a GPT-2-based 1.3B model.
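For readers wondering what "1 day on a single RTX 4090" translates to in money, a back-of-the-envelope estimate is just GPU-hours times a rental rate. The hourly rate below is a hypothetical placeholder for illustration, not a figure from this discussion:

```python
# Rough rental-cost estimate for the setup described above:
# ~1 day of pre-training on a single RTX 4090.
# The $/GPU-hour rate is an assumed example value, not a quoted price.

def estimate_cost(days: float, num_gpus: int, usd_per_gpu_hour: float) -> float:
    """Return the total rental cost in USD for a training run."""
    return days * 24 * num_gpus * usd_per_gpu_hour

# Example: 1 day, 1 GPU, at a hypothetical $0.50/GPU-hour.
print(f"${estimate_cost(1, 1, 0.50):.2f}")  # → $12.00
```

With cloud rates varying widely by provider, the same formula lets you plug in whatever price you can actually get.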
Oh wow, in just 1 day? Crazy. Thanks a lot!
damerajee changed discussion status to closed