Detecting Cyber Security Related Twitter Accounts and Different Sub-Groups: A Multi-Classifier Approach

Mahaini, Mohamad Imad and Li, Shujun (2021) Detecting Cyber Security Related Twitter Accounts and Different Sub-Groups: A Multi-Classifier Approach. In: ASONAM '21: Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. ACM. ISBN 978-1-4503-9128-3. (doi:10.1145/3487351.3492716) (KAR id:90995)

PDF (Publisher pdf). Language: English. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Official URL: https://dl.acm.org/doi/abs/10.1145/3487351.3492716

Abstract

Many cyber security experts, organizations, and cyber criminals are active users on online social networks (OSNs). Therefore, detecting cyber security related accounts on OSNs and monitoring their activities can be very useful for different purposes such as cyber threat intelligence, detecting and preventing cyber attacks and online harms on OSNs, and evaluating the effectiveness of cyber security awareness activities on OSNs.

In this paper, we report our work on developing several machine learning based classifiers for detecting cyber security related accounts on Twitter, including a base-line classifier for detecting cyber security related accounts in general, and three sub-classifiers for detecting three subsets of cyber security related accounts (individuals, hackers, and academia). To train and test the classifiers, we followed a more systemic approach (based on a cyber security taxonomy, real-time sampling of tweets, and crowdsourcing) to construct a dataset of cyber security related accounts with multiple tags assigned to each account. For each classifier, we considered a richer set of features than those used in past studies. Among five machine learning models tested, the Random Forest model achieved the best performance: 93% for the base-line classifier, 88-91% for the three sub-classifiers. We also studied feature reduction of the base-line classifier and showed that using just six features we can already achieve the same performance.
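The feature-reduction result in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the paper's Twitter account features, which are not reproduced on this page. The dataset shape, the hyperparameters, the helper name `fit_and_score`, and the use of impurity-based importances to pick six features are all assumptions made for illustration, not the authors' procedure.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for per-account feature vectors (NOT the paper's dataset).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def fit_and_score(train, test):
    """Hypothetical helper: fit a Random Forest and return it with test accuracy."""
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(train, y_train)
    return model, accuracy_score(y_test, model.predict(test))


# Baseline: train on all 20 synthetic features.
full_model, full_acc = fit_and_score(X_train, X_test)

# Rank features by impurity-based importance and keep the six highest.
top6 = full_model.feature_importances_.argsort()[::-1][:6]

# Retrain on the reduced feature set and compare.
_, reduced_acc = fit_and_score(X_train[:, top6], X_test[:, top6])

print(f"all 20 features: {full_acc:.3f}")
print(f"top 6 features:  {reduced_acc:.3f}")
```

On real account data the two scores would need to be compared with the same metric and evaluation protocol as the paper; the numbers printed here say nothing about the reported 93%.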

Item Type:	Conference or workshop item (Proceeding)
DOI/Identification number:	10.1145/3487351.3492716
Projects:	European Union’s Horizon 2020 Research and Innovation program under the Marie Skłodowska-Curie NeCS project
Uncontrolled keywords:	Cyber Security, Machine Learning, Classification, OSN, Online Social Network, Twitter, Crowdsourcing, Cyber Threat Intelligence, OSINT, Open Source Intelligence
Subjects:	Q Science
Divisions:	Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Funders:	European Commission (https://ror.org/00k4n6c32)
Depositing User:	Imad Mahaini
Date Deposited:	20 Oct 2021 18:43 UTC
Last Modified:	04 Mar 2024 19:16 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/90995 (The current URI for this page, for reference purposes)

University of Kent Author Information

Mahaini, Mohamad Imad.

Creator's ORCID:	https://orcid.org/0000-0001-9889-7837

Li, Shujun.

Creator's ORCID:	https://orcid.org/0000-0001-5628-7328
