Sun, Zhongtian, Harit, Anoushka, Cristea, Alexandra I., Yu, Jialin, Moubayed, Noura Al, Shi, Lei (2023) Is Unimodal Bias Always Bad for Visual Question Answering? A Medical Domain Study with Dynamic Attention. In: 2022 IEEE International Conference on Big Data. pp. 5352-5360. IEEE ISBN 978-1-6654-8045-1. (doi:10.1109/BigData55660.2022.10020791) (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:108674)
Official URL: https://doi.org/10.1109/BigData55660.2022.10020791
Abstract
Medical visual question answering (Med-VQA) aims to answer medical questions based on the clinical images provided. This field is still in its infancy due to the complexity of the trio formed by questions, multimodal features and expert knowledge. In this paper, we tackle a 'myth' in the Natural Language Processing area - that unimodal bias is always considered undesirable in learning models. Additionally, we study the effect of integrating a novel dynamic attention mechanism into such models, inspired by a recent graph deep learning study. Unlike traditional attention, dynamic attention scores are conditioned on different query words in a question and thus enhance the representation learning ability of texts. We propose that some questions are answered more accurately with a reinforcement of question embedding after fusing multimodal features. Extensive experiments have been conducted on the VQA-RAD dataset and demonstrate that our proposed model, reinforCe unimOdal dynamiC Attention (COCA), outperforms the state-of-the-art methods overall and performs competitively at open-ended question answering.
Item Type: | Conference or workshop item (Proceeding) |
---|---|
DOI/Identification number: | 10.1109/BigData55660.2022.10020791 |
Subjects: | Q Science > Q Science (General) > Q335 Artificial intelligence |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Zhongtian Sun |
Date Deposited: | 06 Feb 2025 16:19 UTC |
Last Modified: | 10 Feb 2025 22:13 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/108674 (The current URI for this page, for reference purposes) |