Kent Academic Repository

Parallelism and partitioning in large-scale GAs using spark

Alterkawi, Laila, Migliavacca, Matteo (2019) Parallelism and partitioning in large-scale GAs using spark. In: GECCO '19 Proceedings of the Genetic and Evolutionary Computation Conference. pp. 736-744. ACM, New York ISBN 978-1-4503-6111-8. (doi:10.1145/3321707.3321775) (KAR id:75345)

Abstract

Big Data promises new scientific discovery and economic value. Genetic algorithms (GAs) have proven their flexibility in many application areas, and substantial research effort has been dedicated to improving their performance through parallelisation. In contrast with most previous efforts, we reject approaches that are based on the centralisation of data in the main memory of a single node or that require remote access to shared/distributed memory. We focus instead on scenarios where data is partitioned across machines.

In this partitioned scenario, we explore two parallelisation models: PDMS, inspired by the traditional master-slave model, and PDMD, based on island models; we compare their performance in large-scale classification problems. We implement two distributed versions of BioHEL, a popular large-scale single-node GA classifier, using the Spark distributed data processing platform. In contrast to existing GAs based on MapReduce, Spark allows a more efficient implementation of parallel GAs thanks to its simple, efficient iterative processing of partitioned datasets.

We study the accuracy, efficiency and scalability of the proposed models. Our results show that PDMS provides the same accuracy as traditional BioHEL and exhibits good scalability up to 64 cores, while PDMD provides a substantial reduction of execution time at a minor loss of accuracy.
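
The general difference between a master-slave style GA and an island style GA over partitioned data can be illustrated with a small sketch. This is not the paper's PDMS or PDMD implementation and it does not use Spark or BioHEL: it is plain Python on a toy one-dimensional threshold classifier, and every function name, parameter, and the dataset itself are assumptions made up for illustration. In the first variant a single population is scored by summing per-partition partial results; in the second, each partition evolves its own population on local data only.

```python
import random

def partition(data, n_parts):
    """Split data into disjoint chunks (a stand-in for a partitioned dataset)."""
    return [data[i::n_parts] for i in range(n_parts)]

def local_correct(threshold, part):
    """Count examples in one partition that the rule `x > threshold` gets right."""
    return sum((x > threshold) == label for x, label in part)

def accuracy(threshold, parts):
    """Fraction of all examples, across all partitions, classified correctly."""
    total = sum(len(part) for part in parts)
    return sum(local_correct(threshold, part) for part in parts) / total

def next_generation(population, scores, rng):
    """Keep the better half unchanged and refill with mutated copies of it."""
    ranked = [ind for _, ind in sorted(zip(scores, population), reverse=True)]
    survivors = ranked[: max(1, len(ranked) // 2)]
    n_children = len(population) - len(survivors)
    children = [survivors[i % len(survivors)] + rng.gauss(0, 0.05)
                for i in range(n_children)]
    return survivors + children

def evolve_global(parts, pop_size=20, generations=30, seed=0):
    """Master-slave style: one population, fitness aggregated over partitions."""
    rng = random.Random(seed)
    population = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        # Each partition reports a partial score per individual ("map") ...
        partial = [[local_correct(ind, part) for ind in population]
                   for part in parts]
        # ... and the coordinator sums them per individual ("reduce").
        scores = [sum(column) for column in zip(*partial)]
        population = next_generation(population, scores, rng)
    return population[0]  # top-ranked individual of the last scored generation

def evolve_islands(parts, pop_size=20, generations=30, seed=0):
    """Island style: each partition evolves its own population on local data."""
    champions = []
    for i, part in enumerate(parts):
        rng = random.Random(seed + i)
        population = [rng.random() for _ in range(pop_size)]
        for _ in range(generations):
            scores = [local_correct(ind, part) for ind in population]
            population = next_generation(population, scores, rng)
        champions.append(population[0])
    # Pick the island champion that does best on the whole dataset.
    return max(champions, key=lambda t: accuracy(t, parts))

# Toy dataset (an assumption): the label is simply whether x exceeds 0.6.
data_rng = random.Random(42)
data = [(x, x > 0.6) for x in (data_rng.random() for _ in range(400))]
parts = partition(data, 4)

global_best = evolve_global(parts)
island_best = evolve_islands(parts)
print("global:", accuracy(global_best, parts))
print("islands:", accuracy(island_best, parts))
```

In this sketch the first variant needs one aggregation step per generation, while the second needs none until the end; that trade-off between coordination and local-only evaluation is the kind of difference the abstract's two models are compared on.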

Item Type: Conference or workshop item (Proceeding)
DOI/Identification number: 10.1145/3321707.3321775
Uncontrolled keywords: Genetic Algorithms, Big Data, Spark, Distributed Learning Classifier System, Distributed Data Mining
Subjects: Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming, > QA76.76 Computer software
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Matteo Migliavacca
Date Deposited: 15 Jul 2019 11:27 UTC
Last Modified: 05 Nov 2024 12:38 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/75345 (The current URI for this page, for reference purposes)

University of Kent Author Information

Alterkawi, Laila.

Migliavacca, Matteo.

Creator's ORCID: https://orcid.org/0000-0002-5684-4865