An evaluation of spatiotemporal features based on color invariants for action recognition

Main Author: Fillipe Dias Moreira de Souza
Type: Theses/dissertations
Language: eng
Published in: IBICT 20110613
Local spatiotemporal features have proved to be a powerful tool for representing latent patterns of moving objects in video scenes.

In particular, recognition of human actions has been the principal focus of a growing range of applications, including video indexing, content-based video retrieval, video summarization, filtering of unwanted content, and movie rating, to name a few.

In general, spatiotemporal interest point detectors rely solely on gray-scale values. In addition, descriptions of the support regions are mostly based on histograms of gradient orientation (to capture shape) and optical flow (to capture motion).

Color information, on the other hand, seems to have been overlooked in recent years of improvements to techniques for detecting and describing local features in the space-time domain, even though color is usually considered an important cue for understanding events in our surroundings.

For object and scene recognition in static images, robustness to photometric variations has been achieved by describing local regions of spatial interest points in terms of color invariance properties.

In such approaches, robustness to lighting geometry, illumination intensity, and highlights was built on the well-known dichromatic reflection model.

In this context, the present work makes three main contributions.

First, we have extended the space-time corner detector (STIP) to incorporate color information (using the normalized-RGB color system) at the detection phase; we call this extension ColorSTIP.
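
As a brief illustration of the normalized-RGB color system mentioned above, the sketch below divides each channel by the sum of the three channels, which cancels a uniform scaling of illumination intensity. The function name `normalized_rgb` and the convention used for black pixels are assumptions made for this example and are not taken from the thesis.

```python
def normalized_rgb(r, g, b):
    """Map an RGB triple to normalized-rgb chromaticity coordinates.

    Each channel is divided by R + G + B, so scaling all three
    channels by the same positive factor leaves the result unchanged.
    """
    total = r + g + b
    if total == 0:
        # Chromaticity is undefined for a black pixel. Returning the
        # neutral point is an assumption made for this sketch.
        return (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    return (r / total, g / total, b / total)


# The same surface color under two illumination intensities.
dim = normalized_rgb(50, 100, 50)
bright = normalized_rgb(100, 200, 100)
print(dim, bright)
```

Both calls print the same triple, which is the intensity invariance that motivates applying the detector to these channels rather than to raw RGB values.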

Second, we have considered the use of color histograms (based on the saturation-weighted hue channel) to describe the support regions of spatiotemporal interest points; we call this descriptor HueSTIP.
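
The idea of a saturation-weighted hue histogram can be sketched as follows: each pixel votes for its hue bin with a weight equal to its saturation, so nearly gray pixels, whose hue is unstable, contribute little. This sketch uses HSV hue and saturation from Python's standard `colorsys` module as a stand-in; the thesis may define hue and saturation differently (for instance, from opponent color channels). The function name `hue_histogram` and the choice of 36 bins are hypothetical.

```python
import colorsys


def hue_histogram(pixels, bins=36):
    """Build a saturation-weighted hue histogram from 8-bit RGB pixels.

    pixels: iterable of (R, G, B) tuples with values in 0..255.
    Returns a list of `bins` weights that sums to 1, or all zeros
    when every pixel is achromatic.
    """
    hist = [0.0] * bins
    for r, g, b in pixels:
        # colorsys expects channels in [0, 1] and returns hue and
        # saturation in [0, 1] as well.
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        idx = min(int(h * bins), bins - 1)
        hist[idx] += s  # weight the vote by saturation
    total = sum(hist)
    if total > 0:
        hist = [x / total for x in hist]
    return hist


# A toy support region: one red, one gray, and one blue pixel.
region = [(255, 0, 0), (128, 128, 128), (0, 0, 255)]
hist = hue_histogram(region)
print([round(x, 3) for x in hist if x > 0])
```

In this toy region the gray pixel has zero saturation and therefore adds nothing to the histogram, while the red and blue pixels share the total weight equally.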

Finally, we conducted a thorough analysis of the performance of the proposed extensions for human action recognition in videos of unconstrained scenarios.