---
tags:
- image-to-text
- image-captioning
license: apache-2.0
widget:
- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/savanna.jpg
  example_title: Savanna
- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/football-match.jpg
  example_title: Football Match
- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/airport.jpg
  example_title: Airport
base_model:
- distilbert/distilgpt2
---

Variation of https://huggingface.co/tarekziade/distilvit

Trained on 270k images from Flickr10k and COCO.

Training source code: https://github.com/tarekziade/distilvit

Results:

- eval_loss: 0.2305169701576233
- eval_rouge1: 39.511
- eval_rouge2: 14.7798
- eval_rougeL: 35.9476
- eval_rougeLsum: 35.9497
- eval_gen_len: 11.695219762592236
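
A minimal usage sketch, assuming the checkpoint loads through the `image-to-text` pipeline of the `transformers` library, as the tags on this card suggest. This card does not state its own repository id, so `MODEL_ID` defaults to the base checkpoint linked above and should be replaced with this variation's id. The helper names `extract_caption` and `caption_image` are hypothetical and exist only for illustration.

```python
from typing import Any

# Assumption: placeholder id taken from the base checkpoint linked in this card.
# Replace it with the repository id of this variation.
MODEL_ID = "tarekziade/distilvit"


def extract_caption(outputs: list[dict[str, Any]]) -> str:
    """Return the first caption from image-to-text pipeline output, stripped."""
    if not outputs:
        return ""
    return str(outputs[0].get("generated_text", "")).strip()


def caption_image(image: str, model_id: str = MODEL_ID) -> str:
    """Caption an image given as a local path or a URL."""
    # Imported lazily so extract_caption stays usable without transformers.
    from transformers import pipeline

    captioner = pipeline("image-to-text", model=model_id)
    return extract_caption(captioner(image))


if __name__ == "__main__":
    # One of the widget sample images listed in the card metadata.
    url = "https://huggingface.co/datasets/mishig/sample_images/resolve/main/savanna.jpg"
    print(caption_image(url))
```

Running the script downloads the model weights on first use and prints one short caption. The reported `eval_gen_len` of about 11.7 suggests captions of roughly a dozen tokens.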