I tried a bunch of things: The dangers of unexpected overfitting in classification of brain data

Hosseini, Mahan, Powell, Michael, Collins, John, Callahan-Flintoft, Chloe, Jones, William, Bowman, Howard, Wyble, Brad (2020) I tried a bunch of things: The dangers of unexpected overfitting in classification of brain data. Neuroscience & Biobehavioral Reviews, 119. pp. 456-467. ISSN 0149-7634. (doi:10.1016/j.neubiorev.2020.09.036) (KAR id:84806)

PDF: Author's Accepted Manuscript. Language: English. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Official URL: https://doi.org/10.1016/j.neubiorev.2020.09.036

Abstract

Machine learning has enhanced the abilities of neuroscientists to interpret information collected through EEG, fMRI, and MEG data. With these powerful techniques comes the danger of overfitting of hyperparameters, which can render results invalid. We refer to this problem as ‘overhyping’ and show that it is pernicious despite commonly used precautions. Overhyping occurs when analysis decisions are made after observing analysis outcomes, and it can produce results that are partially or even completely spurious. It is commonly assumed that cross-validation is an effective protection against overfitting or overhyping, but this is not actually true. In this article, we show that spurious results can be obtained on random data by modifying hyperparameters in seemingly innocuous ways, despite the use of cross-validation. We recommend a number of techniques for limiting overhyping, such as lock boxes, blind analyses, pre-registrations, and nested cross-validation. These techniques are common in other fields that use machine learning, including computer science and physics. Adopting similar safeguards is critical for ensuring the robustness of machine-learning techniques in the neurosciences.
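
The effect described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or their experimental setup: the nearest-centroid classifier, the sample and feature counts, and the use of random feature subsets as a stand-in for "hyperparameters" are all assumptions made for illustration. The data are pure noise, so any accuracy above chance is spurious. Selecting the best of many analysis choices by cross-validated accuracy inflates the reported score, while a lock box (data set aside and evaluated once, after all choices are fixed) is not subject to that selection.

```python
import random

def nearest_centroid_accuracy(train, test, features):
    """Fit class centroids on `train` and return classification accuracy on `test`."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, y in test:
        dist = {
            label: sum((x[f] - c) ** 2 for f, c in zip(features, centroid))
            for label, centroid in centroids.items()
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(test)

def cross_validated_accuracy(data, features, k=5):
    """Plain k-fold cross-validation; fold membership is index modulo k."""
    scores = []
    for fold in range(k):
        test = [d for i, d in enumerate(data) if i % k == fold]
        train = [d for i, d in enumerate(data) if i % k != fold]
        scores.append(nearest_centroid_accuracy(train, test, features))
    return sum(scores) / k

def make_noise(rng, n_samples, n_features):
    """Gaussian noise with alternating labels: there is no true signal to find."""
    return [([rng.gauss(0, 1) for _ in range(n_features)], i % 2)
            for i in range(n_samples)]

rng = random.Random(0)
n_features = 20
analysis_set = make_noise(rng, 40, n_features)  # data the analyst may explore
lock_box = make_noise(rng, 40, n_features)      # set aside, opened only once

# "Trying a bunch of things": 200 candidate feature subsets (hypothetical
# stand-ins for choices such as channels or time windows), each scored by
# cross-validation on the analysis set.
candidates = [tuple(sorted(rng.sample(range(n_features), 3))) for _ in range(200)]
scores = [cross_validated_accuracy(analysis_set, c) for c in candidates]
best_cv = max(scores)
best_features = candidates[scores.index(best_cv)]

# Evaluate the chosen configuration once on the lock box.
lock_box_acc = nearest_centroid_accuracy(analysis_set, lock_box, best_features)

print(f"best cross-validated accuracy: {best_cv:.2f}")
print(f"lock-box accuracy:             {lock_box_acc:.2f}")
```

Because the winning configuration was picked for its cross-validated score, that score is an optimistic estimate even though every individual evaluation used held-out folds. The lock-box score was not used in any selection, so it gives an unbiased estimate of performance, which on noise is chance level in expectation.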

Item Type:	Article
DOI/Identification number:	10.1016/j.neubiorev.2020.09.036
Uncontrolled keywords:	Overfitting, Overhyping, Machine learning, Classification, Analysis, EEG
Subjects:	Q Science
Institutional Unit:	Schools > School of Computing
Former Institutional Unit:	Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User:	Howard Bowman
Date Deposited:	14 Dec 2020 10:02 UTC
Last Modified:	28 Apr 2026 09:16 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/84806 (The current URI for this page, for reference purposes)

University of Kent Author Information

Hosseini, Mahan.

Bowman, Howard.

Creator's ORCID:	https://orcid.org/0000-0003-4736-1869