Automatic Information Extraction from Electronic Documents using Machine Learning

Kamaleson, Nishanthan, Chu, Dominique, Otero, Fernando E.B. (2021) Automatic Information Extraction from Electronic Documents using Machine Learning. In: Lecture Notes in Computer Science. 41st SGAI International Conference on Artificial Intelligence, AI 2021, Cambridge, UK, December 14–16, 2021, Proceedings. 13101. pp. 183-194. Springer ISBN 978-3-030-91099-0. E-ISBN 978-3-030-91100-3. (doi:10.1007/978-3-030-91100-3_16) (KAR id:91696)

PDF Author's Accepted Manuscript
Language: English



The digital processing of electronic documents is widely used across many domains to improve the efficiency of information extraction. However, paper documents are still common in practice. To process such documents, a person must inspect them manually and extract the values of interest. Because this task is monotonous and time-consuming, it is prone to human error. In this paper, we present an efficient and robust system that automates this task by combining several machine learning techniques: optical character recognition, object detection and image processing. This not only speeds up the process but also improves the accuracy of the extracted information compared to a manual procedure.

Item Type: Conference or workshop item (Proceeding)
DOI/Identification number: 10.1007/978-3-030-91100-3_16
Uncontrolled keywords: OCR, Layout analysis, Image detection, Information extraction
Subjects: Q Science > Q Science (General) > Q335 Artificial intelligence
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Fernando Otero
Date Deposited: 23 Nov 2021 11:00 UTC
Last Modified: 10 Feb 2022 11:44 UTC