Generate images from text prompts
Generate text from an image and a question
Chat with Gemma 2 for text-based conversations
Generate captions and track objects in videos