Kent Academic Repository

Automatic Information Extraction from Electronic Documents using Machine Learning

Kamaleson, Nishanthan, Chu, Dominique, Otero, Fernando E.B. (2021) Automatic Information Extraction from Electronic Documents using Machine Learning. In: Lecture Notes in Computer Science. 41st SGAI International Conference on Artificial Intelligence, AI 2021, Cambridge, UK, December 14–16, 2021, Proceedings. 13101. pp. 183-194. Springer ISBN 978-3-030-91099-0. E-ISBN 978-3-030-91100-3. (doi:10.1007/978-3-030-91100-3_16) (KAR id:91696)

Abstract

The digital processing of electronic documents is widely exploited across many domains to improve the efficiency of information extraction. However, paper documents are still widely used in practice. To process such documents, a manual procedure is used to inspect them and extract the values of interest. As this task is monotonous and time-consuming, it is prone to human error. In this paper, we present an efficient and robust system that automates this task by combining machine learning techniques: optical character recognition, object detection, and image processing. This not only speeds up the process but also improves the accuracy of the extracted information compared to a manual procedure.

Item Type: Conference or workshop item (Proceeding)
DOI/Identification number: 10.1007/978-3-030-91100-3_16
Uncontrolled keywords: OCR, Layout analysis, Image detection, Information extraction
Subjects: Q Science > Q Science (General) > Q335 Artificial intelligence
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Fernando Otero
Date Deposited: 23 Nov 2021 11:00 UTC
Last Modified: 10 Feb 2022 11:44 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/91696 (The current URI for this page, for reference purposes)

University of Kent Author Information

Kamaleson, Nishanthan.


Chu, Dominique.

Creator's ORCID: https://orcid.org/0000-0002-3706-2905

Otero, Fernando E.B.

Creator's ORCID: https://orcid.org/0000-0003-2172-297X
