arxiv:2408.07500

Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach

Published on Aug 14, 2024

Abstract

In this paper, we construct a large-scale benchmark dataset for Ground-to-Aerial Video-based person Re-Identification, named G2A-VReID, which comprises 185,907 images and 5,576 tracklets covering 2,788 distinct identities. To our knowledge, this is the first dataset for video ReID under Ground-to-Aerial scenarios. The G2A-VReID dataset has the following characteristics: 1) drastic view changes; 2) a large number of annotated identities; 3) rich outdoor scenarios; 4) a huge difference in resolution. Additionally, we propose a new benchmark approach for cross-platform ReID, termed VSLA-CLIP, which transforms the cross-platform visual alignment problem into visual-semantic alignment through a vision-language model (i.e., CLIP) and applies a parameter-efficient Video Set-Level-Adapter module to adapt the image-based foundation model to video ReID tasks. To further reduce the large discrepancy between platforms, we also devise platform-bridge prompts for efficient visual feature alignment. Extensive experiments demonstrate the superiority of the proposed method on all existing video ReID datasets and on our proposed G2A-VReID dataset.
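As a quick sanity check on the dataset scale, the three headline counts quoted in the abstract can be turned into per-identity and per-tracklet averages. The sketch below uses only those three numbers; the function name `dataset_ratios` and the dictionary keys are illustrative and do not come from the paper or any released code.

```python
# Headline counts quoted in the abstract of arXiv:2408.07500.
NUM_IMAGES = 185_907
NUM_TRACKLETS = 5_576
NUM_IDENTITIES = 2_788


def dataset_ratios(images: int, tracklets: int, identities: int) -> dict:
    """Return simple averages derived from the three headline counts.

    Illustrative helper (not from the paper): it only divides the totals,
    so it says nothing about how frames are actually distributed.
    """
    return {
        "tracklets_per_identity": tracklets / identities,
        "images_per_tracklet": images / tracklets,
        "images_per_identity": images / identities,
    }


ratios = dataset_ratios(NUM_IMAGES, NUM_TRACKLETS, NUM_IDENTITIES)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

The counts work out to exactly two tracklets per identity and roughly 33 images per tracklet on average. Two tracklets per identity would be consistent with one ground and one aerial tracklet for each person, but the abstract does not state the split, so that reading is an assumption.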
