Soccer players identification based on visual local features
BALLAN, LAMBERTO;
2007
Abstract
Semantic detection and recognition of the objects and events in a video stream is needed to provide content-based annotation and retrieval of videos. Annotation makes it possible to reuse the video material at a later stage, e.g. to produce new TV programmes. A typical example is sports video, where clips are annotated so that those showing key highlights and players can be reused to produce short summaries for news and sports programmes. Selecting the most interesting actions among all the detected highlights requires further analysis; in particular, the shots that contain a key action are typically followed by close-ups of the players who take part in it. Automatic identification of these players would therefore add considerable value to both the annotation and the retrieval of the key highlights and key players of a sports event. Detecting and recognizing faces in broadcast videos is a widely studied problem. However, in soccer videos, and sports videos in general, current techniques are not suitable for face recognition because of the high variations in pose, illumination, scale and occlusion that occur in an uncontrolled environment. This paper presents a method that copes with these problems. It exploits local features to describe a face, without requiring precise localization of the face's distinguishing parts, and uses a set of poses to describe a person and make recognition more robust. A similarity metric based on the number of matched interest points, able to cope with different face sizes, is also presented and experimentally validated.
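The abstract does not give the exact form of the similarity metric, so the following is only a minimal sketch of the general idea, under stated assumptions. Each face is assumed to be already described by an array of local descriptors (one row per interest point), matching is assumed to use a nearest-neighbour ratio test, and the match count is assumed to be normalized by the size of the smaller descriptor set so that faces of different sizes remain comparable. The function names (`count_matches`, `face_similarity`, `identify`), the `ratio` threshold and the gallery layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def count_matches(desc_a, desc_b, ratio=0.8):
    """Count rows of desc_a whose nearest neighbour in desc_b passes a ratio test.

    Assumption: a match is accepted when the nearest distance is clearly
    smaller than the second-nearest distance (threshold `ratio`).
    """
    if len(desc_a) == 0 or len(desc_b) < 2:
        return 0
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    # After partitioning at index 1, columns 0 and 1 hold the two smallest distances.
    two_smallest = np.partition(dists, 1, axis=1)[:, :2]
    nearest, second = two_smallest[:, 0], two_smallest[:, 1]
    return int(np.sum(nearest < ratio * second))


def face_similarity(desc_a, desc_b, ratio=0.8):
    """Fraction of the smaller descriptor set that finds a match in the larger one.

    Assumption: dividing by the smaller set size keeps the score in [0, 1]
    regardless of how many interest points each face produced.
    """
    small, large = (desc_a, desc_b) if len(desc_a) <= len(desc_b) else (desc_b, desc_a)
    if len(small) == 0:
        return 0.0
    return count_matches(small, large, ratio) / len(small)


def identify(query, gallery, ratio=0.8):
    """Return the best-scoring identity and all scores.

    `gallery` maps a person's name to a list of descriptor arrays, one per
    pose; a person's score is the best similarity over that person's poses.
    """
    scores = {
        name: max(face_similarity(query, pose, ratio) for pose in poses)
        for name, poses in gallery.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Taking the maximum over a person's poses is one simple way to use a set of poses per person; other aggregation rules would fit the same interface.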