Improving Optical Flow on a Pyramid Level

European Conference on Computer Vision (ECCV) 2020 / August 2020

By Markus Hofinger, Samuel Rota Bulò, Lorenzo Porzi, Arno Knapitsch, Thomas Pock, Peter Kontschieder


Abstract

In this work we review the coarse-to-fine spatial feature pyramid concept, which is used in state-of-the-art optical flow estimation networks to make exploration of the pixel flow search space computationally tractable and efficient. Within an individual pyramid level, we improve the cost volume construction process by moving from a warping-based to a sampling-based strategy, which avoids ghosting and hence enables us to better preserve fine flow details. We further amplify the positive effects through a level-specific loss max-pooling strategy that adaptively shifts the focus of the learning process toward underperforming predictions. Our second contribution revises the gradient flow across pyramid levels. The typical operations performed at each pyramid level can lead to noisy, or even contradictory, gradients across levels. We show and discuss how properly blocking some of these gradient components leads to improved convergence and ultimately better performance. Finally, we introduce a distillation concept to counteract catastrophic forgetting during finetuning and thus preserve knowledge across models trained sequentially on multiple datasets. Our findings are conceptually simple and easy to implement, yet result in compelling improvements on relevant error measures, which we demonstrate via exhaustive ablations on datasets such as Flying Chairs2, Flying Things, Sintel and KITTI. We establish new state-of-the-art results on the challenging Sintel and KITTI 2012 test datasets, and also show the portability of our findings to different optical flow and depth-from-stereo approaches.
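To make the contrast between warping-based and sampling-based cost volume construction concrete, here is a minimal 1D sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: features are scalars, correlation is a plain product, lookups use linear interpolation with border clamping, and the function names (`sample_linear`, `cost_volume_warping`, `cost_volume_sampling`) and the `radius` parameter are hypothetical. In the warping variant, the second feature map is first warped with the current flow estimate and then correlated over a fixed window, so each window entry depends on the flow of a neighbouring pixel. In the sampling variant, every window entry for a pixel is read from the unwarped second feature map around the position given by that pixel's own flow.

```python
import math


def sample_linear(signal, pos):
    """Linearly interpolate a 1D signal at a real-valued position (borders clamped)."""
    pos = min(max(pos, 0.0), len(signal) - 1.0)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(signal) - 1)
    w = pos - lo
    return (1.0 - w) * signal[lo] + w * signal[hi]


def cost_volume_warping(f1, f2, flow, radius):
    """Warp f2 with the flow first, then correlate f1 against a fixed window.

    Entry [x][d + radius] uses f2 at (x + d) + flow[x + d]: the lookup is
    driven by the flow of the neighbouring pixel x + d, not of x itself.
    """
    warped = [sample_linear(f2, y + flow[y]) for y in range(len(f2))]
    return [
        [f1[x] * sample_linear(warped, x + d) for d in range(-radius, radius + 1)]
        for x in range(len(f1))
    ]


def cost_volume_sampling(f1, f2, flow, radius):
    """Sample f2 directly around x + flow[x], without building a warped map.

    Entry [x][d + radius] uses f2 at x + flow[x] + d: every entry in the
    window of pixel x is driven by the flow of x alone.
    """
    return [
        [f1[x] * sample_linear(f2, x + flow[x] + d) for d in range(-radius, radius + 1)]
        for x in range(len(f1))
    ]
```

With a zero flow field the two volumes coincide. They diverge near flow discontinuities, where the warped window mixes in lookups driven by the flow of neighbouring pixels.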

