A framework for automatic semantic video annotation: Utilizing similarity and commonsense knowledge bases

Altadmri, Amjad, Ahmed, Amr (2014) A framework for automatic semantic video annotation: Utilizing similarity and commonsense knowledge bases. Multimedia Tools and Applications, 72 (2). pp. 1167-1191. ISSN 1380-7501. E-ISSN 1573-7721. (doi:10.1007/s11042-013-1363-6) (KAR id:45948)

PDF (publisher version). Language: English
Official URL: http://dx.doi.org/10.1007/s11042-013-1363-6

Abstract

The rapidly increasing quantity of publicly available videos has driven research into developing automatic tools for indexing, rating, searching and retrieval. Textual semantic representations, such as tagging, labelling and annotation, are often important factors in the process of indexing any video, because of their user-friendly way of representing the semantics appropriate for search and retrieval. Ideally, this annotation should be inspired by the human cognitive way of perceiving and of describing videos. The difference between the low-level visual contents and the corresponding human perception is referred to as the ‘semantic gap’. Tackling this gap is even harder in the case of unconstrained videos, mainly due to the lack of any previous information about the analyzed video on the one hand, and the huge amount of generic knowledge required on the other. This paper introduces a framework for the Automatic Semantic Annotation of unconstrained videos. The proposed framework utilizes two non-domain-specific layers: low-level visual similarity matching, and an annotation analysis that employs commonsense knowledge bases. A commonsense ontology is created by incorporating multiple-structured semantic relationships. Experiments and black-box tests are carried out on standard video databases for action recognition and video information retrieval. White-box tests examine the performance of the individual intermediate layers of the framework. The evaluation of the results and the statistical analysis show that integrating visual similarity matching with commonsense semantic relationships provides an effective approach to automated video annotation.

Item Type:	Article
DOI/Identification number:	10.1007/s11042-013-1363-6
Uncontrolled keywords:	Video Annotation, Semantic Video Annotation, Automatic Semantic Video Annotation, semantic gap, Video Retrieval, video search engine, Video Information Retrieval, Commonsense Knowledgebase, Commonsense Knowledgebases, Commonsense Knowledge bases, Video matching, Video Similarity
Subjects:	Q Science > Q Science (General) > Q335 Artificial intelligence; Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming, > QA76.575 Multimedia systems; Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming, > QA76.76.E95 Expert Systems (Intelligent Knowledge Based Systems); T Technology > TA Engineering (General). Civil engineering (General) > TA1637 Image processing
Institutional Unit:	Schools > School of Computing; Schools > School of Engineering, Mathematics and Physics > Engineering
Former Institutional Unit:	Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing; Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Engineering and Digital Arts
Depositing User:	Amjad Altadmri
Date Deposited:	10 Dec 2014 12:01 UTC
Last Modified:	20 May 2025 10:15 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/45948 (The current URI for this page, for reference purposes)

University of Kent Author Information

Altadmri, Amjad.
