Point tracking with CoTracker
CoTracker is a fast transformer-based point-tracking model introduced in the paper "CoTracker: It is Better to Track Together". It can track any point in a video and brings some of the benefits of optical flow to tracking.
CoTracker can track:
- Any pixel in a video
- A quasi-dense set of pixels together
- Points that are selected manually or sampled on a grid in any video frame
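To make the grid option concrete, here is a minimal sketch of how query points could be laid out on a regular grid in a chosen frame. The helper make_grid_queries and the (frame index, x, y) layout are assumptions made for illustration. They are not taken from the CoTracker source, and the model's own sampling may place points differently.

```python
def make_grid_queries(grid_size, height, width, frame_index=0):
    """Return grid_size * grid_size query points as (frame_index, x, y) tuples.

    Hypothetical helper for illustration only; points are placed at the
    centres of equally sized cells so none falls on the image border.
    """
    if grid_size < 1:
        raise ValueError("grid_size must be at least 1")
    cell_h = height / grid_size
    cell_w = width / grid_size
    queries = []
    for row in range(grid_size):
        for col in range(grid_size):
            x = (col + 0.5) * cell_w
            y = (row + 0.5) * cell_h
            queries.append((frame_index, x, y))
    return queries


queries = make_grid_queries(grid_size=10, height=480, width=640)
print(len(queries))  # 100
print(queries[0])    # (0, 32.0, 24.0)
```

Manually selected points would simply be a hand-written list of the same kind of triples.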
How to use
Here is how to use this model in the offline mode. First install the video reader:
pip install imageio[ffmpeg]
Then run:
import imageio.v3 as iio
import torch

# Download the video (use the raw file URL; the /blob/ URL serves an HTML page)
url = 'https://github.com/facebookresearch/co-tracker/raw/main/assets/apple.mp4'
frames = iio.imread(url, plugin="FFMPEG")  # T H W C; plugin="pyav" is an alternative

device = 'cuda' if torch.cuda.is_available() else 'cpu'
grid_size = 10  # points are sampled on a regular grid_size x grid_size grid
video = torch.tensor(frames).permute(0, 3, 1, 2)[None].float().to(device)  # B T C H W

# Run Offline CoTracker:
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker2").to(device)
pred_tracks, pred_visibility = cotracker(video, grid_size=grid_size)  # B T N 2, B T N 1
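The visibility output can be summarised with plain Python once it has been moved off the device. The sketch below assumes the visibility values for one video have already been converted to a nested list indexed as [frame][point], for example with pred_visibility[0].reshape(T, N).tolist(), where T and N are the numbers of frames and points. visible_fraction is a hypothetical helper, not part of CoTracker.

```python
def visible_fraction(visibility):
    """Fraction of frames in which each tracked point is marked visible.

    visibility: nested list indexed as [frame][point] holding booleans
    (or 0/1 values). Returns one float per point.
    """
    num_frames = len(visibility)
    if num_frames == 0:
        return []
    num_points = len(visibility[0])
    counts = [0] * num_points
    for frame in visibility:
        for point, is_visible in enumerate(frame):
            counts[point] += 1 if is_visible else 0
    return [count / num_frames for count in counts]


# Toy input: 4 frames, 3 points
toy_visibility = [
    [True, True, False],
    [True, False, False],
    [True, True, False],
    [True, False, True],
]
print(visible_fraction(toy_visibility))  # [1.0, 0.5, 0.25]
```

A summary like this may help when deciding which tracks are reliable enough to draw or to pass on to later processing.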
The online mode reuses video, device, and grid_size from the offline example:
# Run Online CoTracker, the same model with a different API:
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker2_online").to(device)

# Initialize online processing
cotracker(video_chunk=video, is_first_step=True, grid_size=grid_size)

# Process the video in overlapping chunks of 2 * cotracker.step frames
for ind in range(0, video.shape[1] - cotracker.step, cotracker.step):
    pred_tracks, pred_visibility = cotracker(
        video_chunk=video[:, ind : ind + cotracker.step * 2]
    )  # B T N 2, B T N 1
Online processing is more memory-efficient because it handles the video in overlapping chunks rather than all at once, which makes it possible to process longer videos or to process video in real time.
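The indexing in the online loop can be checked without loading the model. The following sketch reproduces only the slice boundaries. online_windows is a hypothetical helper and step=4 is an arbitrary example value; in real use the step comes from cotracker.step.

```python
def online_windows(num_frames, step):
    """Yield (start, end) frame ranges in the order the online loop visits them.

    Mirrors range(0, num_frames - step, step) with slices of length
    2 * step; end is clipped to num_frames, as tensor slicing would do.
    """
    for start in range(0, num_frames - step, step):
        yield start, min(start + 2 * step, num_frames)


# step=4 is an arbitrary value chosen for this example
print(list(online_windows(num_frames=20, step=4)))
# [(0, 8), (4, 12), (8, 16), (12, 20)]
```

Each window shares its first half with the second half of the previous window, so every chunk after the first contributes only step new frames.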
BibTeX entry and citation info
@article{karaev2023cotracker,
  title={CoTracker: It is Better to Track Together},
  author={Nikita Karaev and Ignacio Rocco and Benjamin Graham and Natalia Neverova and Andrea Vedaldi and Christian Rupprecht},
  journal={arXiv:2307.07635},
  year={2023}
}