Song, Yan and Hong, Xinhai and McLoughlin, Ian and Dai, Lirong (2017) Image Classification with CNN-based Fisher Vector Coding. In: 2016 Visual Communications and Image Processing (VCIP). IEEE. ISBN 978-1-5090-5317-9. E-ISBN 978-1-5090-5316-2. (doi:10.1109/VCIP.2016.7805494) (KAR id:57115)
PDF
Author's Accepted Manuscript
Language: English
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Official URL: https://dx.doi.org/10.1109/VCIP.2016.7805494
Abstract
Fisher vector coding methods have been demonstrated to be effective for image classification. With the help of convolutional neural networks (CNNs), several Fisher vector coding methods have shown state-of-the-art performance by adopting the activations of a single fully-connected layer as region features. These methods generally use a diagonal Gaussian mixture model (GMM) to describe the generative process of region features. However, it is difficult to model the complex distribution of a high-dimensional feature space with a limited number of Gaussians obtained by unsupervised learning. Simply increasing the number of Gaussians turns out to be inefficient and computationally impractical.
To address this issue, we re-interpret a pre-trained CNN as a probabilistic discriminative model, and present a CNN-based Fisher vector coding method, termed CNN-FVC. Specifically, activations of the intermediate fully-connected and output soft-max layers are used to implicitly derive the posteriors, mean and covariance parameters for Fisher vector coding. To further improve efficiency, we convert the pre-trained CNN to a fully convolutional one to extract the region features. Extensive experiments have been conducted on two standard scene benchmarks (i.e. SUN397 and MIT67) to evaluate the effectiveness of the proposed method. Classification accuracies of 60.7% and 82.1% are achieved on the SUN397 and MIT67 benchmarks respectively, outperforming previous state-of-the-art approaches. Furthermore, the method is complementary to GMM-FVC methods, allowing a simple fusion scheme to further improve performance to 61.1% and 83.1% respectively.
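For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of conventional Fisher vector coding with a diagonal GMM (the GMM-FVC setting), not the CNN-FVC method itself, whose derivation is not given in this record. It assumes the commonly used formulation with gradients with respect to the means and standard deviations, followed by power and L2 normalisation. The function name `fisher_vector`, the toy dimensions and the random parameters are all illustrative assumptions.

```python
import numpy as np

def fisher_vector(X, weights, means, variances):
    """Encode region features X (N, D) with a diagonal GMM.

    weights: (K,) mixture weights; means, variances: (K, D).
    Returns a vector of length 2 * K * D.
    """
    N, _ = X.shape
    diff = X[:, None, :] - means[None, :, :]              # (N, K, D)

    # Log density of each diagonal Gaussian for each feature, shape (N, K).
    log_prob = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )

    # Posterior (soft assignment) of each Gaussian, computed stably.
    log_post = np.log(weights)[None, :] + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)
    gamma = np.exp(log_post)
    gamma /= gamma.sum(axis=1, keepdims=True)             # (N, K)

    # Gradients with respect to means and standard deviations.
    z = diff / np.sqrt(variances)[None, :, :]             # whitened residuals
    g_mu = np.einsum("nk,nkd->kd", gamma, z)
    g_mu /= N * np.sqrt(weights)[:, None]
    g_sigma = np.einsum("nk,nkd->kd", gamma, z ** 2 - 1.0)
    g_sigma /= N * np.sqrt(2.0 * weights)[:, None]

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])

    # Power normalisation followed by L2 normalisation.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv

# Toy example: 50 region features of dimension 4, a GMM with 3 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
weights = np.array([0.5, 0.3, 0.2])
means = rng.normal(size=(3, 4))
variances = np.full((3, 4), 1.0)
fv = fisher_vector(X, weights, means, variances)
```

The encoding length grows as 2 × K × D, which illustrates why simply adding more Gaussians over high-dimensional CNN features becomes expensive, as the abstract notes.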
Item Type: | Book section |
---|---|
DOI/Identification number: | 10.1109/VCIP.2016.7805494 |
Additional information: | Received a best paper award |
Uncontrolled keywords: | Image Classification, Convolutional Neural Network, Gaussian Mixture Model, Fisher Vector Coding |
Subjects: | T Technology |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Ian McLoughlin |
Date Deposited: | 03 Dec 2016 16:30 UTC |
Last Modified: | 05 Nov 2024 10:47 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/57115 (The current URI for this page, for reference purposes) |