Fisher Vector based CNN architecture for Image Classification

Song, Yan and Wang, Peiseng and Hong, Xinhai and McLoughlin, Ian (2018) Fisher Vector based CNN architecture for Image Classification. In: 2017 IEEE International Conference on Image Processing (ICIP). IEEE. ISBN 978-1-5090-2176-5. E-ISBN 978-1-5090-2175-8. (doi:10.1109/ICIP.2017.8296344)

Official URL
http://dx.doi.org/10.1109/ICIP.2017.8296344

Abstract

In this paper, we tackle the representation learning problem for small-scale fine-grained object recognition and scene classification tasks. Conventional bag-of-features (BoF) methods exploit hand-crafted front-end local features and learn representations via various machine learning techniques. Convolutional neural networks (CNNs) learn the representation directly from raw images and benefit from joint optimization of the network parameters in an end-to-end manner. However, the performance of existing representation learning methods is still unsatisfactory for small-scale recognition tasks. To address this issue, we present a Fisher vector (FV) coding based CNN (FV-CNN) architecture. FV-CNN has three main advantages. First, it exploits activations from an intermediate convolutional layer, together with a probabilistic discriminative model, to derive the FV coding. Second, it takes advantage of end-to-end back-propagation of gradients to jointly optimize the whole learning process. Finally, it can learn a compact representation. When evaluated on benchmark datasets for fine-grained object recognition (Caltech-CUB200) and scene classification (MIT67), FV-CNN achieves accuracies of 88.0% and 82.2%, respectively.
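
For readers unfamiliar with FV coding, the following is a minimal sketch of classical Fisher vector encoding applied to convolutional activations. It is not the FV-CNN implementation from the paper: it assumes a diagonal-covariance Gaussian mixture model whose parameters are fixed and given, rather than the jointly trained model the abstract describes. The function names (`conv_map_to_descriptors`, `fisher_vector`) and the array shapes are hypothetical choices made for illustration.

```python
import numpy as np


def conv_map_to_descriptors(feature_map):
    """Treat each spatial position of a (C, H, W) activation map as one
    C-dimensional local descriptor, giving an (H*W, C) array."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T


def fisher_vector(x, weights, means, sigmas):
    """Classical Fisher vector of descriptors x under a diagonal GMM.

    x:       (T, D) local descriptors
    weights: (K,)   mixture weights, summing to 1 (assumed given)
    means:   (K, D) component means (assumed given)
    sigmas:  (K, D) component standard deviations (assumed given)
    Returns a vector of length 2 * K * D.
    """
    t, d = x.shape
    # Deviation of every descriptor from every component, scaled by sigma.
    z = (x[:, None, :] - means[None, :, :]) / sigmas[None, :, :]  # (T, K, D)

    # Log of the weighted diagonal-Gaussian densities, shape (T, K).
    log_p = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(z ** 2, axis=2)
        - np.sum(np.log(sigmas), axis=1)[None, :]
        - 0.5 * d * np.log(2.0 * np.pi)
    )
    # Posterior responsibilities via a numerically stable softmax.
    log_p = log_p - log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma = gamma / gamma.sum(axis=1, keepdims=True)

    # Gradients of the log-likelihood w.r.t. means and standard deviations.
    g_mu = np.einsum("tk,tkd->kd", gamma, z)
    g_mu = g_mu / (t * np.sqrt(weights)[:, None])
    g_sigma = np.einsum("tk,tkd->kd", gamma, z ** 2 - 1.0)
    g_sigma = g_sigma / (t * np.sqrt(2.0 * weights)[:, None])

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    # Signed square-root (power) normalization, then L2 normalization.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv
```

As a usage illustration, an activation map of shape (C, H, W) can be passed through `conv_map_to_descriptors` and the result encoded with `fisher_vector`. With K mixture components the output has 2 * K * C entries. In an end-to-end setting such as the one the abstract describes, the mixture parameters would be learned jointly with the network instead of being fixed as they are here.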

Item Type: Book section
DOI/Identification number: 10.1109/ICIP.2017.8296344
Uncontrolled keywords: Image Classification, Visual Representation, Convolutional Neural Network, End-to-End Training
Subjects: T Technology
Divisions: Faculties > Sciences > School of Computing > Data Science
Depositing User: Ian McLoughlin
Date Deposited: 04 Sep 2017 15:13 UTC
Last Modified: 26 Sep 2019 08:52 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/63217 (The current URI for this page, for reference purposes)
McLoughlin, Ian: https://orcid.org/0000-0001-7111-2008