IMPROVING FAIRNESS OF DECISION TREE-BASED CLASSIFICATION ALGORITHMS

Bagriacik, Meryem (2026) IMPROVING FAIRNESS OF DECISION TREE-BASED CLASSIFICATION ALGORITHMS. Doctor of Philosophy (PhD) thesis, University of Kent. (Access to this publication is currently restricted. You may be able to access a copy if URLs are provided) (KAR id:114483)

PDF. Language: English. Restricted to Repository staff only until May 2029. This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

The increasing use of machine learning algorithms in decision-making systems has raised concerns about fairness. Although machine learning algorithms, particularly tree-based classification algorithms, have generally been successful in efficiently handling large datasets, they are prone to generating unfair decisions about certain individuals or groups depending on their demographics, such as race, neighbourhood, and age. The reason is that such algorithms can encode potential biases from various sources. In practice, the majority of works in the literature have focused on fairness for deep learning or natural language processing (NLP) models, while existing approaches for decision trees generally consider fairness only in terms of group fairness. Even where certain works have taken individual fairness into account, they rely on only a single or, at most, a couple of fairness metrics, so individual fairness remains largely overlooked. However, one must employ both individual and group fairness metrics to combat direct and indirect discrimination in the tree-based algorithms that are widely used for classification tasks in critical domains such as health and finance; accordingly, there is a gap in the use of multiple fairness metrics in tree-based algorithms. Furthermore, although decision tree design typically includes post-pruning to address overfitting by simplifying complex trees, its use in improving fairness has to date been somewhat limited. Therefore, this thesis addresses the need for new, fair tree-based classification algorithms that are not limited to binary classification and that fill the fairness-related gaps in the use of tree-based algorithms.

We first tackle the limitation of using only a single or a couple of fairness metrics, which mostly relate to group fairness. We introduce Fair-C4.5 algorithm variants, not limited to binary classification tasks, that can effectively include multiple fairness metrics from both group and individual fairness and that incorporate accuracy gain to control the accuracy-fairness trade-off. Further, Fair-C4.5 considers individual fairness metrics and includes the sensitive feature during the splitting procedure.
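
As a rough illustration of the idea in the paragraph above, the sketch below scores a candidate split by combining an accuracy-oriented gain with a fairness penalty. It is not the Fair-C4.5 formulation, which is not given in this abstract. The function names, the `weight` parameter, the choice of a single statistical-parity gap as the fairness term, and the use of plain information gain (rather than the gain ratio that C4.5 conventionally uses) are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, partition):
    """Entropy reduction from splitting `labels` into the index groups in `partition`."""
    n = len(labels)
    remainder = sum(
        len(idx) / n * entropy([labels[i] for i in idx]) for idx in partition
    )
    return entropy(labels) - remainder


def parity_gap(predictions, sensitive, positive):
    """Largest difference in positive-prediction rate between sensitive groups."""
    rates = []
    for group in set(sensitive):
        members = [p for p, s in zip(predictions, sensitive) if s == group]
        rates.append(sum(p == positive for p in members) / len(members))
    return max(rates) - min(rates)


def fair_split_score(labels, sensitive, partition, positive, weight=0.5):
    """Hypothetical criterion: information gain minus a weighted parity penalty.

    Each branch of the split predicts its majority class; `weight` is an
    assumed knob for the accuracy-fairness trade-off.
    """
    predictions = [None] * len(labels)
    for idx in partition:
        majority = Counter(labels[i] for i in idx).most_common(1)[0][0]
        for i in idx:
            predictions[i] = majority
    gain = information_gain(labels, partition)
    return gain - weight * parity_gap(predictions, sensitive, positive)
```

Under this assumed criterion, two splits with the same information gain are ranked differently when one of them separates the sensitive groups and the other does not; raising `weight` makes the criterion favour the fairer split more strongly.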

Moreover, we investigate how to improve fairness as a post-processing step by pruning the C4.5 tree without negatively affecting accuracy. In particular, for a given C4.5 tree, we propose three distinct pruning strategies that improve fairness by simplifying trees to mitigate potential overfitting problems.
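
The three pruning strategies are not described in this abstract. The sketch below shows only one generic way fairness-aware post-pruning could work, under stated assumptions: a subtree is collapsed into a leaf when doing so neither lowers accuracy nor widens a statistical-parity gap on a held-out set. The `Node` class, the helper names, and the acceptance rule are hypothetical, and class `1` is assumed to be the positive class.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    label: int                     # majority class at this node
    feature: Optional[str] = None  # None marks a leaf
    children: Dict[object, "Node"] = field(default_factory=dict)


def predict(node, row):
    """Follow the row down the tree; fall back to the node label on unseen values."""
    while node.feature is not None:
        child = node.children.get(row[node.feature])
        if child is None:
            break
        node = child
    return node.label


def accuracy(root, rows, labels):
    return sum(predict(root, r) == y for r, y in zip(rows, labels)) / len(labels)


def parity_gap(root, rows, sensitive_key):
    """Largest difference in positive-prediction rate (class 1) between groups."""
    by_group = {}
    for r in rows:
        by_group.setdefault(r[sensitive_key], []).append(predict(root, r))
    rates = [sum(p == 1 for p in preds) / len(preds) for preds in by_group.values()]
    return max(rates) - min(rates)


def prune(root, node, rows, labels, sensitive_key):
    """Bottom-up pruning: keep a collapse only if accuracy and parity do not worsen."""
    if node.feature is None:
        return
    for child in node.children.values():
        prune(root, child, rows, labels, sensitive_key)
    before = (accuracy(root, rows, labels), parity_gap(root, rows, sensitive_key))
    saved = (node.feature, node.children)
    node.feature, node.children = None, {}  # tentatively turn the node into a leaf
    after = (accuracy(root, rows, labels), parity_gap(root, rows, sensitive_key))
    if after[0] < before[0] or after[1] > before[1]:
        node.feature, node.children = saved  # undo: the collapse made things worse
```

A call such as `prune(tree, tree, rows, labels, "group")` would modify `tree` in place; the rows and labels are assumed to come from a held-out set rather than from the training data.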

The final contribution of this thesis is to explore the development of fair random forest algorithms by extending two different widely used random forests. We aim to achieve an accuracy-fairness trade-off in the design of these algorithms by incorporating multiple fairness criteria with information gain. We propose various voting systems to further enhance the fairness of final predictions in new and fair random forest algorithms.
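
The voting systems proposed in the thesis are not specified in this abstract. Purely as an illustration of how a vote could take fairness into account, the sketch below weights each tree's vote by one minus a per-tree fairness gap, so that trees with a smaller gap have more influence on the final prediction. The function name, the weighting rule, and the assumption that each gap lies in [0, 1] are hypothetical.

```python
from collections import defaultdict


def fairness_weighted_vote(tree_predictions, tree_gaps):
    """Return the class with the largest total weight.

    tree_predictions: the class predicted by each tree for one instance.
    tree_gaps: an assumed per-tree fairness gap in [0, 1], e.g. measured
               on a validation set; a smaller gap means a larger vote weight.
    """
    totals = defaultdict(float)
    for prediction, gap in zip(tree_predictions, tree_gaps):
        totals[prediction] += 1.0 - gap
    return max(totals, key=totals.get)
```

With this rule, a single tree with a small gap can outvote two trees with large gaps, which a plain majority vote would not allow.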

Extensive experiments on a wide range of real-world datasets have shown that the proposed fair tree-based algorithms are competitive with some of the more well-known tree-based algorithms described in the literature in terms of accuracy-fairness trade-offs. This thesis addresses unexplored areas of such research in the context of fair tree-based classifiers; thus, to the best of our knowledge, the contributions described herein are entirely original.

Item Type:	Thesis (Doctor of Philosophy (PhD))
Thesis advisor:	Otero, Fernando
Uncontrolled keywords:	fairness, decision tree, random forest, classification, pruning, interpretability
Former Institutional Unit:	There are no former institutional units.
SWORD Depositor:	System Moodle
Depositing User:	System Moodle
Date Deposited:	06 May 2026 14:10 UTC
Last Modified:	07 May 2026 03:22 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/114483 (The current URI for this page, for reference purposes)

University of Kent Author Information

Bagriacik, Meryem.

