A study on the statistical evaluation of classifiers

Neumann, Nadine M., Plastino, Alexandre, Pinto Junior, Jony A., Freitas, Alex A. (2020) A study on the statistical evaluation of classifiers. Knowledge Engineering Review, 36 (e1). pp. 1-26. ISSN 0269-8889. E-ISSN 1469-8005. (doi:10.1017/S0269888920000417) (KAR id:87091)

PDF Author's Accepted Manuscript
Language: English

Statistical significance analysis, based on hypothesis tests, is a common approach for comparing classifiers. However, many studies oversimplify this analysis by simply checking the condition p-value < 0.05, ignoring important concepts such as the effect size and the statistical power of the test. This problem is serious enough that the American Statistical Association has taken a strong stand on the subject, noting that although the p-value is a useful statistical measure, it has been widely misused and misinterpreted. This work highlights problems caused by the misuse of hypothesis tests and shows how the effect size and the power of the test can provide important information for better decision-making. To investigate these issues, we perform empirical studies with different classifiers and 50 datasets, using Student’s t-test and the Wilcoxon test to compare classifiers. The results show that an isolated p-value analysis can lead to wrong conclusions and that evaluating the effect size and the power of the test contributes to more principled decision-making.
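
As an illustration of the two quantities the abstract says are often ignored, the following minimal sketch computes a paired effect size and an approximate test power for two classifiers evaluated on the same datasets. It is not taken from the paper: the function names are hypothetical, the effect size is assumed to be Cohen's d on the per-dataset accuracy differences, and the power is estimated with a normal approximation rather than the exact noncentral t distribution.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_effect_size(acc_a, acc_b):
    """Cohen's d for paired samples (hypothetical helper).

    Computed as the mean of the per-dataset differences divided by
    their sample standard deviation. Raises if all differences are equal.
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    return mean(diffs) / stdev(diffs)


def approx_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided paired test (hypothetical helper).

    Assumption: the test statistic is treated as normally distributed
    with mean effect_size * sqrt(n), which is a reasonable approximation
    only for moderately large n.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(effect_size) * math.sqrt(n)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Made-up accuracies, for illustration only.
    acc_a = [0.90, 0.80, 0.85, 0.78, 0.92]
    acc_b = [0.88, 0.79, 0.80, 0.77, 0.85]
    d = paired_effect_size(acc_a, acc_b)
    print("effect size:", d)
    print("approximate power:", approx_power(d, len(acc_a)))
```

A small effect size with a p-value below 0.05, or a low power with a p-value above 0.05, are both cases where the p-value alone could be misleading.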

Item Type: Article
DOI/Identification number: 10.1017/S0269888920000417
Uncontrolled keywords: data mining, machine learning, classification, statistical significance test
Subjects: Q Science > Q Science (General) > Q335 Artificial intelligence
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Alex Freitas
Date Deposited: 13 Mar 2021 15:58 UTC
Last Modified: 27 May 2021 23:00 UTC