---
language: en
license: mit
tags:
  - vision
  - image-to-text
  - image-captioning
model_name: microsoft/git-base
pipeline_tag: image-to-text
library_name: transformers
datasets:
  - ayoubkirouane/One-Piece-anime-captions
---

## Model Details

- **Model Name:** Git-base-One-Piece
- **Base Model:** Microsoft's `git-base` model
- **Model Type:** Generative Image-to-Text (GIT)
- **Fine-Tuned On:** the `One-Piece-anime-captions` dataset
- **Fine-Tuning Purpose:** to generate text captions for images related to the anime series *One Piece*

## Model Description

Git-base-One-Piece is a fine-tuned variant of Microsoft's git-base model, specifically trained for the task of generating descriptive text captions for images from the One-Piece-anime-captions dataset.

The dataset consists of 856 image–caption pairs, giving the model a focused, domain-specific training corpus.

The model is conditioned on both CLIP image tokens and text tokens and employs a teacher forcing training approach. It predicts the next text token while considering the context provided by the image and previous text tokens.
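As an illustration of this setup, here is a minimal teacher-forcing forward pass with the base checkpoint; the image path and caption are hypothetical stand-ins for one dataset example:

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("microsoft/git-base")
model = AutoModelForCausalLM.from_pretrained("microsoft/git-base")

# Hypothetical training example: one image and its reference caption.
image = Image.open("luffy_frame.jpg")
caption = "Luffy grins and stretches his arm across the deck of the ship."

# The processor produces image features (pixel_values) and text tokens (input_ids).
inputs = processor(images=image, text=caption, return_tensors="pt")

# Passing the caption tokens as labels trains with teacher forcing: each position
# predicts the next token given the image and the preceding ground-truth tokens.
outputs = model(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    pixel_values=inputs.pixel_values,
    labels=inputs.input_ids,
)
print(outputs.loss)
```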


## Limitations

- The quality of generated captions may vary depending on the complexity and diversity of images from the One-Piece-anime-captions dataset.
- The model's output is based on the data it was fine-tuned on, so it may not generalize well to images outside the dataset's domain. Generating highly detailed or contextually accurate captions may still be a challenge.

## Usage

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-to-text", model="ayoubkirouane/git-base-One-Piece")
```
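Once the pipeline is created, captioning is a single call. The file name below is a hypothetical placeholder; a URL or a `PIL.Image` also works as input:

```python
# Continues from the pipeline created above.
result = pipe("one_piece_frame.jpg")  # hypothetical local path
print(result[0]["generated_text"])
```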

Alternatively, load the processor and model directly:

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("ayoubkirouane/git-base-One-Piece")
model = AutoModelForCausalLM.from_pretrained("ayoubkirouane/git-base-One-Piece")
```
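A sketch of the full generation loop with the directly loaded model, assuming the image is fetched with `requests` and opened with PIL (the URL is a placeholder):

```python
import requests
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("ayoubkirouane/git-base-One-Piece")
model = AutoModelForCausalLM.from_pretrained("ayoubkirouane/git-base-One-Piece")

# Placeholder URL; substitute any One Piece image.
url = "https://example.com/one_piece_frame.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Encode the image into pixel values, then autoregressively generate a caption.
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values=pixel_values, max_length=50)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(caption)
```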