Learning to Parse Videos
GRAND Seminar Friday, December 05, 11am, Room 4201
Abstract: In this talk, we look into the problem of segmenting, tracking, and extracting time-varying 3D shape and camera poses for non-rigid objects in monocular videos. Our method segments and tracks objects and their parts using past segmentation and tracking experience from a training set, and uses the segmented point trajectories of each object to extract 3D shape assuming a low-rank shape prior. We segment using motion boundaries and learnt saliency detection, and outperform the previous state of the art by a margin in challenging video scenes. We "learn to track" using a novel tracking loss in a distance learning framework, and outperform color and texture histograms as well as deep features learnt from ImageNet classification or detection tasks. We extract dense 3D object models from realistic monocular videos, a problem typically studied with lab-acquired datasets, pre-segmented objects, and oracle trajectories.
About the Speaker
Katerina Fragkiadaki is a postdoctoral researcher in the Computer Vision Laboratory at UC Berkeley, working with Jitendra Malik. She received her Ph.D. from the University of Pennsylvania in 2013. She is the co-recipient of the best thesis award in the Computer Science Department at UPenn. She has a BA in Electrical and Computer Engineering from the National Technical University of Athens. She works on problems related to video understanding, tracking, and video segmentation.