arxiv:2308.06954

Global Features are All You Need for Image Retrieval and Reranking

Published on Aug 14, 2023

Abstract

Image retrieval systems conventionally use a two-stage paradigm, leveraging global features for initial retrieval and local features for reranking. However, the scalability of this method is often limited due to the significant storage and computation cost incurred by local feature matching in the reranking stage. In this paper, we present SuperGlobal, a novel approach that exclusively employs global features for both stages, improving efficiency without sacrificing accuracy. SuperGlobal introduces key enhancements to the retrieval system, specifically focusing on the global feature extraction and reranking processes. For extraction, we identify sub-optimal performance when the widely-used ArcFace loss and Generalized Mean (GeM) pooling methods are combined and propose several new modules to improve GeM pooling. In the reranking stage, we introduce a novel method to update the global features of the query and top-ranked images by only considering feature refinement with a small set of images, thus being very compute and memory efficient. Our experiments demonstrate substantial improvements compared to the state of the art in standard benchmarks. Notably, on the Revisited Oxford+1M Hard dataset, our single-stage results improve by 7.1%, while our two-stage gain reaches 3.7% with a strong 64,865x speedup. Our two-stage system surpasses the current single-stage state-of-the-art by 16.3%, offering a scalable, accurate alternative for high-performing image retrieval systems with minimal time overhead. Code: https://github.com/ShihaoShao-GH/SuperGlobal.

Community

Introduces SuperGlobal: a retrieval system that uses global features for both initial retrieval and reranking, replacing local feature matching (LFM) in the second stage to reduce overhead time and increase throughput. The contributions are (1) new modules that improve GeM (generalized mean) pooling when training with a margin-based ArcFace loss, applying GeM across regions and scales, and (2) a reranking step that refines global descriptors through a proposed kNN aggregation over a small set of images.

GeM pooling: the trainable p-value drifts away from its optimum when trained with margin losses; p = 1 weights all spatial features equally, while a large p selects a few important features. GeM+ replaces the learned p with an optimal value found by grid search. Regional-GeM applies Lp pooling (with its own parameter p_r) to regional maps, which are obtained from and added back to the convolutional features already in use. Scale-GeM applies GeM pooling (with a shift so values stay positive) to global features from multiple scales, using its own aggregation parameter p_ms. A minimal sketch of GeM pooling follows after this summary.

Reranking: query expansion (QE) and database-side augmentation (DBA) do not scale to many queries and large databases. SuperGlobal reranking instead retrieves the top-M (typically 1000) database entries for a query, takes the top-K (about 10) neighbours of the query and of each top-M image, and refines the database global descriptors by weighted pooling over each image's top-K (dot-product similarity weights with a multiplier factor). The expanded query descriptor is the max-pool of the refined descriptors of the query's top-K neighbours within the initial top-M, so each query ends up with a global descriptor and an expanded descriptor. The reranking score is the average of two similarities, query descriptor vs. refined top-M and expanded query descriptor vs. refined top-M, and the top-M list is re-sorted by this score (also sketched in code below).

Experiments: Revisited Oxford (ROxford) is used to estimate p (GeM: 4.6), p_r (regional: 2.5), and p_ms (multi-scale: effectively infinity, i.e. max pooling) by grid search on mAP, with a CVNet backbone for features. SuperGlobal achieves the best results on ROxford and RParis Medium/Hard compared with global-only and global + local-rerank systems, and improves ResNet-based retrieval over DELG and CVNet. Ablations on GeM+ (p via grid search), Regional-GeM, Scale-GeM, and the modified ReLU (from CVNet) show that all components are needed for the best results, and that SuperGlobal reranking can match local-feature reranking approaches such as CVNet. The appendix has more reranking comparisons (number of candidates), a parameter study, a combination with DELG (better than with CVNet), and a note on generalization. From Peking University and Google.
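
To make the GeM discussion concrete, here is a minimal PyTorch sketch of GeM pooling with an explicit exponent p. The role of p (average pooling at p = 1, max-pooling-like behaviour for large p) and the value 4.6 come from the summary above; the function itself and its defaults are illustrative, not the paper's official implementation.

```python
import torch
import torch.nn.functional as F

def gem(x: torch.Tensor, p: float = 3.0, eps: float = 1e-6) -> torch.Tensor:
    """Generalized Mean (GeM) pooling over a (B, C, H, W) feature map.

    p = 1 is plain average pooling (all spatial positions weighted equally);
    a large p approaches max pooling (a few strong activations dominate).
    """
    return x.clamp(min=eps).pow(p).mean(dim=(-2, -1)).pow(1.0 / p)  # -> (B, C)

# Example: pooling a batch of (non-negative, ReLU-like) feature maps into descriptors.
feats = torch.relu(torch.randn(2, 2048, 7, 7))   # stand-in for backbone activations
desc = F.normalize(gem(feats, p=4.6), dim=-1)
print(desc.shape)                                # torch.Size([2, 2048])
```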
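
And a simplified sketch of the global-only reranking described above: shortlist the top-M candidates, refine their descriptors by similarity-weighted aggregation over each candidate's top-K neighbours, build a max-pooled expanded query, and re-sort by the averaged similarities. The weighting factor (`weight_scale`), normalisation choices, and defaults here are assumptions for illustration; the official repository implements the paper's exact scheme.

```python
import torch
import torch.nn.functional as F

def refine_descriptors(descs: torch.Tensor, k: int = 10, weight_scale: float = 1.0) -> torch.Tensor:
    """Refine each L2-normalised descriptor by similarity-weighted pooling over its
    k nearest neighbours (including itself). `weight_scale` stands in for the
    paper's multiplier factor and is an assumption."""
    sims = descs @ descs.t()                          # (N, N) dot-product similarities
    topk_sim, topk_idx = sims.topk(k, dim=1)          # each row's k nearest neighbours
    weights = (topk_sim * weight_scale).unsqueeze(-1) # (N, k, 1) similarity weights
    refined = (weights * descs[topk_idx]).sum(dim=1)  # weighted aggregation -> (N, D)
    return F.normalize(refined, dim=-1)

def superglobal_rerank(query: torch.Tensor, db: torch.Tensor, m: int = 1000, k: int = 10) -> torch.Tensor:
    """Rerank the top-m database images for one L2-normalised query descriptor."""
    m = min(m, db.size(0))
    top_m = (db @ query).topk(m).indices              # initial shortlist by global similarity
    cand = db[top_m]                                  # (m, D) candidate descriptors

    refined = refine_descriptors(cand, k=min(k, m))   # refine candidates among themselves

    # Expanded query: max-pool the refined descriptors of the query's top-k candidates.
    qk = (cand @ query).topk(min(k, m)).indices
    query_exp = F.normalize(refined[qk].max(dim=0).values, dim=-1)

    # Final score: average of query->refined and expanded-query->refined similarities.
    scores = 0.5 * (refined @ query + refined @ query_exp)
    return top_m[scores.argsort(descending=True)]     # reranked database indices

# Example usage with random, L2-normalised descriptors.
db = F.normalize(torch.randn(5000, 2048), dim=-1)
query = F.normalize(torch.randn(2048), dim=-1)
print(superglobal_rerank(query, db, m=1000, k=10)[:5])
```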

Links: arxiv (CVNet, QE, DBA), PapersWithCode, GitHub

