arxiv:2110.11624

SciCap: Generating Captions for Scientific Figures

Published on Oct 22, 2021

Abstract

Researchers use figures to communicate rich, complex information in scientific papers. The captions of these figures are critical to conveying effective messages. However, low-quality figure captions commonly occur in scientific articles and may decrease understanding. In this paper, we propose an end-to-end neural framework to automatically generate informative, high-quality captions for scientific figures. To this end, we introduce SCICAP, a large-scale figure-caption dataset based on computer science arXiv papers published between 2010 and 2020. After pre-processing (figure-type classification, sub-figure identification, text normalization, and caption text selection), SCICAP contained more than two million figures extracted from over 290,000 papers. We then established baseline models that caption graph plots, the dominant (19.2%) figure type. The experimental results showed both opportunities and steep challenges of generating captions for scientific figures.

