
Is P-value<0.05 Enough? Two Case Studies in Classifiers Evaluation

Freitas, Alex A., Pinto Junior, Jony A., Plastino, Alexandre, Neumann, Nadine M. (2018) Is P-value<0.05 Enough? Two Case Studies in Classifiers Evaluation. In: Annals of the National Meeting of Artificial and Computational Intelligence (ENIAC). pp. 94-103. (doi:10.5753/eniac.2018.4407) (KAR id:72111)


A common tool for comparing classifiers is statistical significance analysis, performed through hypothesis testing. However, many researchers attempt to establish statistical significance by blindly checking the condition p-value < 0.05, ignoring important concepts such as effect size and statistical power. This work highlights possible problems caused by misuse of the hypothesis test and shows how effect size and statistical power can provide information for better decision making. To this end, two case studies applying Student's t-test and the Wilcoxon signed-rank test to the comparison of two classifiers are presented.
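
As a rough illustration of the abstract's point, the sketch below reports an effect size alongside a test result for two classifiers evaluated on the same folds. It is not the paper's code: the function names, the choice of Cohen's d for paired samples as the effect-size measure, and the fold scores are all assumptions made for this example. It uses only the Python standard library, so it gives the paired t statistic without a p-value and computes the Wilcoxon signed-rank p-value by exact enumeration, which is only practical for small samples. Statistical power is not computed here, since that needs the noncentral t distribution.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def paired_effect_size(a, b):
    """Cohen's d for paired samples: mean difference / SD of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / stdev(diffs)


def paired_t_statistic(a, b):
    """Student's paired t statistic; degrees of freedom are len(a) - 1."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def wilcoxon_signed_rank_exact(a, b):
    """Return (W+, two-sided exact p-value) for the Wilcoxon signed-rank test.

    Zero differences are dropped and tied magnitudes receive average ranks.
    All 2**n sign assignments are enumerated, so keep n small.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied magnitudes starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - total / 2)

    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - total / 2) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)


if __name__ == "__main__":
    # Hypothetical data: correctly classified instances (out of 1000) per fold.
    clf_a = [812, 799, 825, 808, 817, 803, 821, 810, 815, 806]
    clf_b = [805, 801, 818, 802, 812, 800, 813, 809, 811, 798]

    w_plus, p_value = wilcoxon_signed_rank_exact(clf_a, clf_b)
    print("paired t statistic:", round(paired_t_statistic(clf_a, clf_b), 3))
    print("Wilcoxon W+:", w_plus, "exact p-value:", round(p_value, 4))
    print("effect size (paired Cohen's d):", round(paired_effect_size(clf_a, clf_b), 3))
```

Reading the effect size next to the p-value shows whether a difference that passes the 0.05 threshold is also large enough to matter in practice.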

Item Type: Conference or workshop item (Proceeding)
DOI/Identification number: 10.5753/eniac.2018.4407
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Alex Freitas
Date Deposited: 01 Feb 2019 13:05 UTC
Last Modified: 16 Feb 2021 14:02 UTC
