Kent Academic Repository

An Investigation into Generating High-quality, Diversified Datasets of Microbiological Images for Supervised Computer Vision Models

Mirzaee Bafti, Saber (2022) An Investigation into Generating High-quality, Diversified Datasets of Microbiological Images for Supervised Computer Vision Models. Doctor of Philosophy (PhD) thesis, University of Kent. (doi:10.22024/UniKent/01.02.99575) (KAR id:99575)

Abstract

Supervised deep neural networks are trained on datasets that must be annotated before use. To develop a reliable deep neural network, a dataset should meet several criteria, including high-quality annotation, diversity, and abundance of data. Generating such datasets is costly and time-consuming, especially for image datasets, because large-scale, diverse images are hard to access and image annotation is laborious. These problems are exacerbated in the medical domain, since medical images are more expensive to collect and their annotation requires in-depth domain knowledge. Thus, obtaining big data and high-quality annotation are two of the most difficult challenges in building medical image datasets, not to mention ethical considerations. The computer vision community has put considerable effort into tackling these challenges, e.g., by using computational techniques to generate images synthetically at low cost (in money, time, etc.) or to facilitate the annotation process. Despite intensive efforts, many aspects of the domain and its solutions remain understudied. For example, crowdsourcing, a common way of generating rapid and cost-effective annotations, carries the risk of low-skilled annotators, which degrades annotation quality. Moreover, the tedious nature of some annotation tasks can detrimentally affect annotation quality in prolonged annotation processes, even for skilled workers. In this Ph.D. thesis, some of these challenges were therefore explored comprehensively, and solutions were proposed to bridge these gaps across the three studies outlined below.

First, as a prerequisite for this Ph.D. thesis, a web-based platform for annotating image datasets was developed, powered by a crowdsourcing tool that was used in the subsequent studies. This platform is now available online at www.aiconsole.com. Furthermore, a dataset of microbiological images of three different parasite groups was collected and annotated by the biologist research partners. In the first study, we compared annotation of microbiological images by annotators (also known as crowd workers or crowd annotators in the crowdsourcing context) using an AI-based assistive tool with manual annotation. To accomplish this, the web-based annotation platform was integrated with a novel assistive tool (based on a weakly trained object detection model), and a two-day experiment with crowd workers (with and without the assistive tool, respectively) was conducted in two modes: i) AI-based assistive annotation and ii) manual annotation. A set of quantitative evaluations was conducted to assess the annotators' behaviour and the assistive tool's performance. Overall, the results showed that an assistive tool based on a weakly trained object detection model can decrease the annotation cost (measured by time and number of clicks). Based on the findings of this study, recommendations are provided on how future platforms with the same assistive tool can be designed to engage annotators more fully in the task and so improve performance. Because the results on annotators' behaviour and on the effect of fatigue on annotators' performance were not conclusive, the platform was upgraded with additional tools to address further research questions in the next study.

The second study aimed to answer three research questions: i) how crowd workers' performance changes over time during a prolonged task; ii) whether annotators’ fatigue and performance can be assessed via annotation-based and mouse-based features; and iii) how well a new aggregation technique performs when it combines crowd workers' annotations according to their estimated quality. In this study, we found both an increase and a decrease in annotators’ performance (as measured by the Dice Similarity Coefficient; DSC), attributable to learning and fatigue effects: workers in the learning region gained experience, resulting in better performance, while in the fatigue region their performance deteriorated. A set of extracted annotation-related and mouse-related features demonstrated a strong correlation with the workers' quality and fatigue level, which motivated the creation of regression models for estimating workers' performance.
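To make the DSC metric and the idea of quality-aware aggregation from research question iii concrete, here is a minimal Python sketch. It is not the thesis's implementation: the function names, the toy masks, and the quality weights are hypothetical, and the vote shown is a generic per-pixel weighted majority rather than the specific method evaluated in the study.

```python
import numpy as np

def dice(a, b):
    """Dice Similarity Coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total

def weighted_majority_vote(masks, weights):
    """Per-pixel vote: foreground where the weighted share exceeds 0.5."""
    masks = np.asarray(masks, dtype=float)      # shape (n_annotators, H, W)
    weights = np.asarray(weights, dtype=float)  # shape (n_annotators,)
    share = np.tensordot(weights, masks, axes=1) / weights.sum()
    return share > 0.5

# Toy data (hypothetical): one careful annotator and two careless ones.
truth = np.array([[1, 1, 0],
                  [0, 1, 0]])
masks = np.array([
    [[1, 1, 0], [0, 1, 0]],   # matches the reference mask
    [[1, 0, 0], [0, 0, 0]],   # misses two cells
    [[1, 0, 0], [0, 0, 1]],   # misses two cells, adds a false one
])

# Hypothetical quality estimates, e.g. produced by a regression model.
quality = [0.9, 0.3, 0.3]

plain = weighted_majority_vote(masks, [1, 1, 1])   # conventional majority vote
weighted = weighted_majority_vote(masks, quality)  # quality-weighted vote

print("DSC, plain majority:   ", dice(plain, truth))
print("DSC, weighted majority:", dice(weighted, truth))
```

In this toy case the two careless annotators outvote the careful one under equal weights, while the weighted vote lets the higher estimated quality dominate. How the weights are estimated is the substantive part and is not shown here.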

Additionally, we proposed a new Weighted Majority Voting (WMV) method for aggregating annotations that takes into account the estimated quality of each individual annotation. In comparison with the benchmark aggregation techniques (conventional majority voting and STAPLE), the new aggregation method showed a relative improvement in the mean and variance of DSCs.

The third study tackled the lack of diversity in microbiology image datasets by developing a GAN-based image-to-image translation model (BioGAN) that converts microbiology images taken in the lab into images with the visual characteristics of images taken in the field. This study was motivated by the fact that collecting microbiological images in the field is not as simple or affordable as lab-based image collection. By adding a Perceptual loss (comprising a Content reconstruction loss and a Style reconstruction loss) to the Adversarial loss of a classical GAN, the model penalises the difference between the high-level (texture) features of the synthetic image and those of a real-world field image.
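A loss of this shape is often written as below, following the perceptual-loss formulation commonly used in style transfer. The abstract does not give the exact formula, so this is a sketch under assumptions: the weights \lambda_c and \lambda_s, the pretrained feature extractor \phi, the layer indices j and l, and the use of the lab image x as the content target are all illustrative choices, not the thesis's definition.

```latex
\begin{align}
\mathcal{L}_{\text{total}} &= \mathcal{L}_{\text{adv}}
  + \lambda_c \, \mathcal{L}_{\text{content}}
  + \lambda_s \, \mathcal{L}_{\text{style}} \\
\mathcal{L}_{\text{content}} &= \frac{1}{C_j H_j W_j}
  \left\lVert \phi_j(\hat{y}) - \phi_j(x) \right\rVert_2^2 \\
\mathcal{L}_{\text{style}} &= \sum_{l}
  \left\lVert \operatorname{Gram}_l(\hat{y}) - \operatorname{Gram}_l(y) \right\rVert_F^2 \\
\operatorname{Gram}_l(z)_{c,c'} &= \frac{1}{C_l H_l W_l}
  \sum_{h=1}^{H_l} \sum_{w=1}^{W_l} \phi_l(z)_{h,w,c} \, \phi_l(z)_{h,w,c'}
\end{align}
```

Here x is a lab-taken image, \hat{y} is the synthetic image generated from it, y is a real field image, and \phi_l(z) is the feature map of shape H_l \times W_l \times C_l at layer l. The content term keeps the cells where they are, while the style term pulls texture statistics towards those of field images.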

The proposed BioGAN model was then tested on its ability to translate laboratory-taken images of Prototheca into field-like images, using qualitative evaluation by experts and quantitative evaluation with the Mask R-CNN object detection framework. We found that the generated images helped to boost both the diversity and the volume of the dataset. In the synthetically generated images, the spatial characteristics remain the same (i.e., the cells remain in the same position with the same dimensions), which means that the annotations for the lab-taken images are valid and usable for the synthetic field images, which reduces the cost of annotation. These findings and the developed models extend theoretical and practical knowledge in the area of medical image annotation, supporting the creation of low-cost but high-quality image datasets for supervised computer vision models based on neural networks. Specifically, the contributions lie in i) providing AI-based tools for computer vision practitioners and researchers to generate cost-effective yet high-quality annotations on image datasets, ii) developing a set of guidelines to help developers design better crowdsourcing platforms, iii) understanding users' behaviour and interactions in crowdsourcing environments, iv) aggregating annotations from crowdsourcing workers more effectively, and v) demonstrating the potential use of a GAN model for enhancing the diversity of image datasets. Also, as one of the major practical contributions of this PhD, the crowdsourcing image annotation platform and the code for the image translation model have been published for use by practitioners.

Item Type: Thesis (Doctor of Philosophy (PhD))
Thesis advisor: Siang Ang, Chee
DOI/Identification number: 10.22024/UniKent/01.02.99575
Uncontrolled keywords: Computational biology, Image segmentation, Crowdsourcing, User behaviour, Image translation, GAN network
Subjects: T Technology
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Engineering and Digital Arts
SWORD Depositor: System Moodle
Depositing User: System Moodle
Date Deposited: 19 Jan 2023 10:10 UTC
Last Modified: 01 Jan 2024 00:00 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/99575 (The current URI for this page, for reference purposes)

University of Kent Author Information

Mirzaee Bafti, Saber.
