Automatic Instrument and Environmental Sound Recognition for Media Annotation of TV Content

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

2 Citations (Scopus)

Abstract

Due to the lack of annotation of their large video archives, multimedia content provider companies and television channels do not use the data in their archives to their full extent. To help address this problem, we have developed a tool that combines audio and visual information to annotate video. The tool has been used by a video production company, which has given us positive feedback. The main innovation of this tool is the use of environmental sound recognition to annotate video. Here we focus on the tool's audio information extraction method, which consists of a sound recognizer that learns a small set of spectral features from the data using non-negative matrix factorization. The recognizer can be used for different purposes, such as classifying musical instruments, identifying the notes that are played, and distinguishing environmental sounds like water, traffic, trains and people.
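
The idea of learning a small set of spectral features with non-negative matrix factorization can be illustrated with a minimal sketch. It is not the recognizer described in the abstract: the abstract does not state the cost function, the update rule, or the classification step, so the Euclidean multiplicative updates, the reconstruction-error classifier, the synthetic "spectrograms", and all function names below are assumptions made for illustration only.

```python
import numpy as np

EPS = 1e-9  # keeps the multiplicative updates away from division by zero


def update_activations(V, W, H):
    """One multiplicative update of H for the Euclidean NMF objective."""
    return H * (W.T @ V) / (W.T @ W @ H + EPS)


def learn_basis(V, rank, n_iter=300, seed=0):
    """Factor a non-negative spectrogram V (bins x frames) as W @ H; return W."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H = update_activations(V, W, H)
        W = W * (V @ H.T) / (W @ H @ H.T + EPS)
    return W


def reconstruction_error(V, W, n_iter=300, seed=0):
    """Fit activations for a fixed basis W and return ||V - W @ H||."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H = update_activations(V, W, H)
    return float(np.linalg.norm(V - W @ H))


def classify(V, bases):
    """Assumed decision rule: pick the class whose basis reconstructs V best."""
    errors = {label: reconstruction_error(V, W) for label, W in bases.items()}
    return min(errors, key=errors.get)


# Synthetic stand-in for real audio: each class has two spectral templates
# that occupy its own band of 8 frequency bins (hypothetical data).
rng = np.random.default_rng(42)
templates = {}
for label, bins in {"water": slice(0, 8), "traffic": slice(8, 16)}.items():
    T = np.zeros((16, 2))
    T[bins] = rng.random((8, 2)) + 0.1
    templates[label] = T


def excerpt(label, n_frames):
    """Random non-negative mixture of a class's templates over n_frames."""
    return templates[label] @ rng.random((2, n_frames))


# Learn one small basis per class, then classify unseen excerpts.
bases = {label: learn_basis(excerpt(label, 200), rank=2) for label in templates}
predictions = {label: classify(excerpt(label, 50), bases) for label in templates}
```

With real recordings, `V` would typically be a magnitude spectrogram of the audio rather than synthetic data, and classes with overlapping spectra would be much harder to separate than in this toy setup.
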
Original language: Unknown
Title of host publication: Proceedings of the IEEE International Conference on Audio, Language and Image Processing
Pages: 1125-1130
Publication status: Published - 1 Jan 2012
Event: IEEE International Conference on Audio, Language and Image Processing
Duration: 1 Jan 2012 → …
