Extracting Moving People and Categorizing Their Activities in Video
Speaker: Juan Carlos Niebles Duque
Series: Final Public Orals
Location: Engineering Quadrangle B327
Date/Time: Tuesday, December 14, 2010, 2:00 p.m. - 4:00 p.m.
Abstract:
Detecting and categorizing human motion in unconstrained video sequences is an important problem in computer vision, with potential impact on a wide variety of applications such as video search and indexing, smart surveillance systems, and video game interfaces. In this talk, we focus on two questions: Where are the moving humans in a video sequence? And what actions or activities are they performing? We have proposed a number of statistical models for human action recognition based on spatial and spatio-temporal local features: an unsupervised bag-of-words model for simple action recognition, a constellation-of-bags-of-features hierarchical model for recognizing simple actions, and a discriminative framework for modeling the temporal composition of simple motions into complex activities. In the second part of the talk, we present a fully automatic framework to detect and extract arbitrary human motion volumes from challenging real-world videos collected from YouTube. Our method carefully combines bottom-up and top-down cues, which enables extraction in near real time.

