Perusquia Cortes, Jose Antonio (2022) Bayesian Nonparametric Methods for Cyber Security with Applications to Malware Detection and Classification. Doctor of Philosophy (PhD) thesis, University of Kent. (doi:10.22024/UniKent/01.02.93553) (KAR id:93553)
Language: English
This work is licensed under a Creative Commons Attribution 4.0 International License.
Official URL: https://doi.org/10.22024/UniKent/01.02.93553
Abstract
The statistical approach to cyber security has become an active and important area of research due to the growth in the number and threat of cyber attacks. In this thesis, we centre our attention on the Bayesian approach to cyber security, which provides several modelling advantages such as the flexibility achieved through the probabilistic quantification of uncertainty. In particular, we have found that Bayesian models have mainly been used to detect volume-traffic anomalies, network anomalies and malicious software. To provide a unifying view of these ideas, we first present a thorough review of Bayesian methods applied to cyber security.
The use of Bayesian models to detect malware and classify it into known malicious classes is one of the cyber security areas discussed in our review. However, in contrast to the detection of traffic and network anomalies, this area has not been widely developed from a Bayesian perspective. That is why we have centred our attention on developing novel supervised learning Bayesian nonparametric models to detect and classify malware using binary features built directly from the executables’ binary code. For these methods, important theoretical properties and simulation techniques are fully developed, and, using real malware data, we have compared their performance against well-known machine learning models that have been widely applied in this area.
With respect to our methodologies, we first present a new discrete nonparametric prior specifically designed for binary data that builds on an elegant nonparametric hierarchical structure, which allows us to study the importance of each individual feature across the groups found in the data. Moreover, because the number of features is large and possibly redundant, we have developed a generalised version of the model that allows the introduction of a feature selection step within the inferential learning. Finally, for more complex modelling where there is a need to introduce dependence across the features, we have extended the capabilities of this new class of nonparametric priors by using it as the building block of a latent feature model.
Item Type: | Thesis (Doctor of Philosophy (PhD)) |
---|---|
DOI/Identification number: | 10.22024/UniKent/01.02.93553 |
Subjects: | H Social Sciences > HA Statistics |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Mathematics, Statistics and Actuarial Science |
SWORD Depositor: | System Moodle |
Depositing User: | System Moodle |
Date Deposited: | 14 Mar 2022 09:23 UTC |
Last Modified: | 05 Nov 2024 12:58 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/93553 (The current URI for this page, for reference purposes) |