Disentangling Architecture and Training for Optical Flow

Sun, Deqing; Herrmann, Charles; Reda, Fitsum; Rubinstein, Michael; Fleet, David; Freeman, William T.

arXiv:2203.10712 (cs)

[Submitted on 21 Mar 2022 (v1), last revised 19 Sep 2022 (this version, v3)]

Abstract: How important are training details and datasets to recent optical flow models like RAFT? And do they generalize? To explore these questions, rather than develop a new model, we revisit three prominent models, PWC-Net, IRR-PWC and RAFT, with a common set of modern training techniques and datasets, and observe significant performance gains, demonstrating the importance and generality of these training details. Our newly trained PWC-Net and IRR-PWC models show surprisingly large improvements, up to 30% versus original published results on Sintel and KITTI 2015 benchmarks. They outperform the more recent Flow1D on KITTI 2015 while being 3x faster during inference. Our newly trained RAFT achieves an Fl-all score of 4.31% on KITTI 2015, more accurate than all published optical flow methods at the time of writing. Our results demonstrate the benefits of separating the contributions of models, training techniques and datasets when analyzing performance gains of optical flow methods. Our source code will be publicly available.
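
For readers unfamiliar with the Fl-all figure quoted above: on KITTI 2015 it is the percentage of ground-truth pixels whose estimated flow is an outlier, where a pixel counts as an outlier if its end-point error exceeds both 3 px and 5% of the ground-truth flow magnitude. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function name `fl_all`, its parameters, and the list-of-tuples data layout are assumptions made for illustration.

```python
import math


def fl_all(pred, gt, valid=None, abs_thresh=3.0, rel_thresh=0.05):
    """Return the percentage of valid pixels whose flow is an outlier.

    pred, gt: sequences of (u, v) flow vectors, one per pixel (hypothetical layout).
    valid:    optional sequence of booleans marking pixels that have ground truth.
    """
    if valid is None:
        valid = [True] * len(gt)

    outliers = 0
    total = 0
    for (pu, pv), (gu, gv), ok in zip(pred, gt, valid):
        if not ok:
            continue  # pixels without ground truth are not evaluated
        total += 1
        epe = math.hypot(pu - gu, pv - gv)  # end-point error in pixels
        mag = math.hypot(gu, gv)            # ground-truth flow magnitude
        # Outlier only if BOTH the absolute and the relative threshold are
        # exceeded; comparing against rel_thresh * mag avoids dividing by zero.
        if epe > abs_thresh and epe > rel_thresh * mag:
            outliers += 1

    if total == 0:
        raise ValueError("no valid ground-truth pixels")
    return 100.0 * outliers / total


# Toy example with four pixels (made-up values, not benchmark data).
gt_flow = [(10.0, 0.0), (100.0, 0.0), (0.0, 0.0), (10.0, 0.0)]
pred_flow = [(10.5, 0.0), (104.0, 0.0), (4.0, 0.0), (20.0, 0.0)]
print(fl_all(pred_flow, gt_flow))
```

In the toy example the second pixel has a 4 px error but is not counted, because 4 px is below 5% of its 100 px ground-truth motion; this relative tolerance is what distinguishes Fl-all from a plain 3 px threshold.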

Comments:	Accepted to ECCV22. 33 pages, including supplementals. Website at: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2203.10712 [cs.CV]
	(or arXiv:2203.10712v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.10712

Submission history

From: Charles Herrmann
[v1] Mon, 21 Mar 2022 03:15:18 UTC (39,569 KB)
[v2] Fri, 2 Sep 2022 21:46:59 UTC (20,580 KB)
[v3] Mon, 19 Sep 2022 20:41:23 UTC (20,580 KB)
