Is P-value<0.05 Enough? Two Case Studies in Classifiers Evaluation

Freitas, Alex A., Pinto Junior, Jony A., Plastino, Alexandre, Neumann, Nadine M. (2018) Is P-value<0.05 Enough? Two Case Studies in Classifiers Evaluation. In: Annals of the National Meeting of Artificial and Computational Intelligence (ENIAC). pp. 94-103. (doi:10.5753/eniac.2018.4407) (KAR id:72111)

PDF Publisher pdf Language: English
Official URL: https://doi.org/10.5753/eniac.2018.4407

Abstract

A common tool for comparing classifiers is statistical significance analysis, performed through hypothesis testing. However, many researchers attempt to obtain statistical significance by blindly evaluating the p-value < 0.05 condition, ignoring important concepts such as effect size and statistical power. This work highlights possible problems caused by the misuse of hypothesis tests and shows how effect size and statistical power can provide information for better decision making. To this end, two case studies applying Student’s t-test and the Wilcoxon signed-rank test to the comparison of two classifiers are presented.
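
As a rough illustration of why the abstract asks for effect size alongside the p-value, the sketch below computes the paired t statistic and a standardized effect size (Cohen's d for paired differences, often written d_z) for two classifiers evaluated on the same folds. The accuracy values and the function name `paired_t_and_effect_size` are hypothetical and are not taken from the paper; this is a minimal sketch using only the Python standard library, not the authors' procedure.

```python
import math
import statistics


def paired_t_and_effect_size(scores_a, scores_b):
    """Return (t statistic, Cohen's d_z) for paired scores of two classifiers."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    d_z = mean_diff / sd_diff  # standardized effect size, independent of n
    t_stat = mean_diff / (sd_diff / math.sqrt(n))  # paired t statistic
    return t_stat, d_z


# Hypothetical per-fold accuracies of two classifiers on the same 10 folds.
acc_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.80, 0.85, 0.79]
acc_b = [0.80, 0.79, 0.82, 0.80, 0.82, 0.77, 0.82, 0.79, 0.83, 0.79]

t_stat, d_z = paired_t_and_effect_size(acc_a, acc_b)
print(f"t = {t_stat:.3f}, d_z = {d_z:.3f}, n = {len(acc_a)}")
```

Because t = d_z * sqrt(n) for the paired test, a fixed, possibly tiny, effect can cross any significance threshold once n is large enough, which is why the p-value alone says little about practical relevance.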

Item Type:	Conference or workshop item (Proceeding)
DOI/Identification number:	10.5753/eniac.2018.4407
Divisions:	Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User:	Alex Freitas
Date Deposited:	01 Feb 2019 13:05 UTC
Last Modified:	05 Nov 2024 12:34 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/72111 (The current URI for this page, for reference purposes)

University of Kent Author Information

Freitas, Alex A.

Creator's ORCID:	https://orcid.org/0000-0001-9825-4700