Attentive Sensing for Dynamic Scene Analysis

Supported by: NSERC Idea to Innovation Program

For any visual system there is a tradeoff between field of view (FOV) and resolution. A wide FOV is advantageous because it allows the system to detect potentially important events arriving from many different directions. However, a wide FOV comes at the expense of resolution, since a fixed number of photoreceptors must be spread over a larger portion of the visual field.
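The tradeoff can be made concrete with a back-of-envelope calculation. The sketch below uses illustrative numbers (a 1920-pixel-wide sensor and two hypothetical FOVs, not figures from the project) to show how angular resolution scales inversely with FOV when the pixel count is fixed.

```python
# Illustrative sketch of the FOV/resolution tradeoff for a fixed-size sensor.
# The sensor width (1920 px) and the two FOVs are assumed example values.

def angular_resolution_deg_per_px(fov_deg: float, pixels: int) -> float:
    """Degrees of visual field covered by one pixel (smaller = sharper)."""
    return fov_deg / pixels

wide = angular_resolution_deg_per_px(120.0, 1920)   # wide-FOV monitoring camera
narrow = angular_resolution_deg_per_px(10.0, 1920)  # narrow-FOV detail camera

print(f"wide:   {wide:.4f} deg/px")    # 0.0625 deg/px
print(f"narrow: {narrow:.4f} deg/px")  # 0.0052 deg/px
print(f"narrow view is {wide / narrow:.0f}x sharper over its FOV")  # 12x
```

With the same pixel budget, shrinking the FOV from 120 to 10 degrees yields a twelvefold gain in angular resolution, which is precisely why a single camera cannot serve both roles well.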

The human visual system has evolved a bipartite solution to this tradeoff. Central vision is served by roughly five million photoreceptive cones that provide high-resolution sensing over a five-degree FOV, while roughly one hundred million rods provide relatively low-resolution vision over the remainder of the visual field. The effective resolution is extended by fast gaze-shifting mechanisms and a memory system that allows a form of integration over multiple fixations.

The goal of this project is to better understand the human visual attention system and to use this understanding to build useful attentive machine vision systems. Traditional machine vision systems employ a single camera with a relatively small FOV. However, applications such as sports videography require a large FOV to monitor an extended environment, as well as high resolution to resolve important details (e.g., the location of the puck in ice hockey). These two requirements conflict and cannot both be met optimally with a single camera. Our laboratory has developed attentive sensing technology (Canadian Patent 2386347, US Patent 7130490) that meets these requirements by closely coupling two camera systems with different FOVs.

The system integrates a wide FOV camera providing continuous monitoring of the larger environment with a narrow FOV pan-tilt camera that can be directed to objects of interest using specialized computer vision and control software. The system thus delivers detailed information from critical targets (e.g., human faces) in the full context of the wider scene.
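The coupling between the two cameras requires, at minimum, a geometric mapping from a target's position in the wide-FOV image to pan and tilt commands for the narrow-FOV camera. The sketch below is a hypothetical illustration of that mapping (not the patented system's actual control software): it assumes a calibrated pinhole model with the two cameras approximately co-located, and the function name, image size, and FOV values are all assumptions.

```python
import math

# Hypothetical sketch: map a target's pixel coordinates in the wide-FOV image
# to pan/tilt angles for the narrow-FOV camera, assuming a pinhole camera model
# with the optical axis through the image centre and co-located cameras.

def pixel_to_pan_tilt(u, v, width, height, hfov_deg, vfov_deg):
    """Convert image coordinates (u, v) to (pan, tilt) in degrees.

    Uses the pinhole relation angle = atan(pixel_offset / focal_length_px),
    with focal length recovered from the camera's FOV. Positive pan is to
    the right of the optical axis; positive tilt is upward.
    """
    fx = (width / 2) / math.tan(math.radians(hfov_deg / 2))
    fy = (height / 2) / math.tan(math.radians(vfov_deg / 2))
    pan = math.degrees(math.atan((u - width / 2) / fx))
    tilt = math.degrees(math.atan((height / 2 - v) / fy))
    return pan, tilt

# A target detected at the image centre requires no camera motion:
print(pixel_to_pan_tilt(960, 540, 1920, 1080, 120.0, 90.0))  # (0.0, 0.0)
```

In practice the wide camera's detections would drive a tracking loop that repeatedly issues such pan/tilt commands, keeping the target (e.g., a face) centred in the narrow camera while the wide camera continues to monitor the full scene.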

We are now working to design, build, evaluate and refine next-generation commercial prototypes of this technology for emerging applications such as sports videography, distance learning and surveillance from terrestrial and unmanned aerial vehicle platforms.