Is P-value<0.05 Enough? Two Case Studies in Classifiers Evaluation

Freitas, Alex A., Pinto Junior, Jony A., Plastino, Alexandre, Neumann, Nadine M. (2018) Is P-value<0.05 Enough? Two Case Studies in Classifiers Evaluation. In: Annals of the National Meeting of Artificial and Computational Intelligence (ENIAC). pp. 94-103. (doi:10.5753/eniac.2018.4407)

Statistical significance analysis, performed through hypothesis testing, is a common tool for comparing classifiers. However, many researchers pursue statistical significance by blindly checking the condition p-value<0.05, ignoring important concepts such as effect size and statistical power. This work highlights problems that can arise from misusing hypothesis tests and shows how effect size and statistical power can inform better decision making. To this end, two case studies are presented that apply Student’s t-test and the Wilcoxon signed-rank test to the comparison of two classifiers.
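
To illustrate the two quantities the abstract says are often ignored, the following is a minimal sketch (not the authors' code) of how an effect size and an approximate statistical power could be computed for a paired comparison of two classifiers. It assumes per-fold scores for both classifiers on the same folds, uses Cohen's d on the paired differences as the effect size, and estimates power with a normal approximation to the two-sided paired test rather than the exact noncentral t distribution. All function names and the example accuracies are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_effect_size(scores_a, scores_b):
    """Cohen's d for paired samples: mean difference / std. dev. of differences."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs) / stdev(diffs)


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic, which equals d * sqrt(n) for n paired observations."""
    n = len(scores_a)
    return paired_effect_size(scores_a, scores_b) * math.sqrt(n)


def approx_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided paired test (normal approximation).

    Assumption: the test statistic is treated as normal with mean
    effect_size * sqrt(n) and unit variance; for small n the exact
    noncentral t distribution would give a somewhat lower power.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(effect_size) * math.sqrt(n)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical per-fold accuracies, made up for illustration only.
    acc_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.81]
    acc_b = [0.80, 0.78, 0.82, 0.80, 0.81, 0.79, 0.80, 0.83, 0.79, 0.80]
    d = paired_effect_size(acc_a, acc_b)
    print("effect size d:", d)
    print("t statistic:", paired_t_statistic(acc_a, acc_b))
    print("approximate power:", approx_power(d, len(acc_a)))
```

Reporting these values next to the p-value shows whether a significant difference is also large enough to matter, and whether a non-significant result might simply reflect too few paired observations.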

Item Type: Conference or workshop item (Proceeding)
DOI/Identification number: 10.5753/eniac.2018.4407
Divisions: Faculties > Sciences > School of Computing
Faculties > Sciences > School of Computing > Computational Intelligence Group
Depositing User: Alex Freitas
Date Deposited: 01 Feb 2019 13:05 UTC
Last Modified: 30 May 2019 08:52 UTC