
Parallel Computing TEDA for High Frequency Streaming Data Clustering

Gu, Xiaowei, Angelov, Plamen P., Gutierrez, German, Iglesias, Jose Antonio, Sanchis, Araceli (2016) Parallel Computing TEDA for High Frequency Streaming Data Clustering. In: INNS 2016: Advances in Big Data. INNS Conference on Big Data. Advances in Intelligent Systems and Computing book series, 529. pp. 238-253. Springer, Cham. ISBN 978-3-319-47897-5. E-ISBN 978-3-319-47898-2. (doi:10.1007/978-3-319-47898-2_25) (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:90213)

Official URL
https://doi.org/10.1007/978-3-319-47898-2_25

Abstract

This paper introduces Parallel_TEDA, a novel online clustering approach for processing high frequency streaming data. The approach is developed within the recently introduced TEDA theory and inherits all of its advantages. It uses a number of data stream processors that collaborate efficiently to achieve parallel computation and a much higher processing speed. Each processor works on chunks of the whole data stream, and a fusion center gathers the key information from the processors and generates the overall output. The quality of the generated clusters is monitored continuously within the data processors, and stale clusters are removed to ensure the correctness and timeliness of the overall clustering results. This, in turn, gives the approach a stronger ability to handle shifts and drifts that may occur in live data streams. Numerical experiments on benchmark datasets show that Parallel_TEDA achieves higher performance and faster processing than well-known alternative approaches. The processing time has been demonstrated to fall exponentially as more data processors are involved. The approach is well suited to real-time processing and analytics of high frequency streaming data.
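
The architecture described in the abstract (chunks of the stream handed to several processors, stale clusters dropped locally, and a fusion center combining the local results) can be illustrated with a rough Python sketch. This is not the authors' algorithm, whose details are not given on this page: a plain Euclidean distance threshold stands in for the TEDA-based cluster criteria, the processors run one after another rather than in parallel, and every name and parameter (`Processor`, `fuse`, `run`, `radius`, `max_age`, `chunk_size`) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    mean: list        # running mean of the points absorbed so far
    count: int        # number of points absorbed
    last_seen: int    # stream index of the most recent point absorbed


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class Processor:
    """One stream processor keeping its own local clusters (assumed design)."""

    def __init__(self, radius, max_age):
        self.radius = radius      # assumption: fixed distance threshold, not TEDA
        self.max_age = max_age    # assumption: a cluster idle longer than this is stale
        self.clusters = []

    def update(self, t, x):
        # Remove stale clusters before handling the new point.
        self.clusters = [c for c in self.clusters if t - c.last_seen <= self.max_age]
        nearest = min(self.clusters, key=lambda c: sq_dist(c.mean, x), default=None)
        if nearest is not None and sq_dist(nearest.mean, x) <= self.radius ** 2:
            # Absorb the point with an incremental mean update.
            nearest.count += 1
            nearest.mean = [m + (v - m) / nearest.count for m, v in zip(nearest.mean, x)]
            nearest.last_seen = t
        else:
            self.clusters.append(Cluster(list(x), 1, t))


def fuse(processors, radius):
    """Fusion center: merge local clusters whose means lie within `radius`."""
    fused = []
    for p in processors:
        for c in p.clusters:
            target = next((f for f in fused if sq_dist(f.mean, c.mean) <= radius ** 2), None)
            if target is None:
                fused.append(Cluster(list(c.mean), c.count, c.last_seen))
            else:
                # Count-weighted average of the two means.
                total = target.count + c.count
                target.mean = [(target.count * a + c.count * b) / total
                               for a, b in zip(target.mean, c.mean)]
                target.count = total
                target.last_seen = max(target.last_seen, c.last_seen)
    return fused


def run(stream, n_processors=2, chunk_size=3, radius=1.0, max_age=100):
    """Deal consecutive chunks of the stream to processors in turn, then fuse."""
    processors = [Processor(radius, max_age) for _ in range(n_processors)]
    for t, x in enumerate(stream):
        processors[(t // chunk_size) % n_processors].update(t, x)
    return fuse(processors, radius)
```

Calling `run(points)` on an iterable of equal-length numeric tuples returns the fused clusters. In a genuinely parallel setting each `Processor` would live in its own worker, and only the compact per-cluster summaries (mean, count, last update) would travel to the fusion step, which is what keeps that step cheap.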

Item Type: Conference or workshop item (Paper)
DOI/Identification number: 10.1007/978-3-319-47898-2_25
Uncontrolled keywords: High frequency streaming data; TEDA; Parallel computation; Clustering; Real time
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Amy Boaler
Date Deposited: 14 Sep 2021 14:46 UTC
Last Modified: 15 Sep 2021 14:50 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/90213 (The current URI for this page, for reference purposes)