Kent Academic Repository

Discrimination with many variables

Brown, Philip J., Fearn, T., Haque, Masudul (1999) Discrimination with many variables. Journal of the American Statistical Association, 94 (448). pp. 1320-1329. ISSN 0162-1459. (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:17133)

Abstract

Many statistical methods for discriminant analysis do not adapt well or easily to situations where the number of variables is large, possibly even exceeding the number of cases in the training set. We explore a variety of methods for providing robust identification of future samples in this situation. We develop a range of flexible Bayesian methods, and primarily a new hierarchical covariance compromise method, akin to regularized discriminant analysis. Although the methods are much more widely applicable, the motivating problem was that of discriminating between groups of samples on the basis of their near-infrared spectra. Here the ability of the Bayesian methods to take account of continuity of the spectra may be beneficial. The spectra may consist of absorbances or reflectances at as many as 1,000 wavelengths, and yet there may be only tens or hundreds of training samples in which both sample spectrum and group identity are known. Such problems arise in the food and pharmaceutical industries; for example, authentication of foods (e.g., detecting the adulteration of orange juice) and identification of pharmaceutical ingredients. Our illustrating example concerns the discrimination of 39 microbiological taxa and 8 aggregate genera. Simulations also illustrate the effectiveness of the hierarchical Bayes covariance method. We discuss a number of scoring rules, both local and global, for judging the fit of data to the Bayesian models, and adopt a cross-classificatory approach for estimating hyperparameters.

Item Type: Article
Uncontrolled keywords: Bayesian methods; cross-validation; discrimination; Gaussian processes; hierarchical covariances; scoring rules; smoothing; spectroscopy
Subjects: Q Science > QA Mathematics (inc Computing science)
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Mathematics, Statistics and Actuarial Science
Depositing User: M. Nasiriavanaki
Date Deposited: 06 Jul 2009 07:55 UTC
Last Modified: 05 Nov 2024 09:52 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/17133 (The current URI for this page, for reference purposes)

University of Kent Author Information

Brown, Philip J.
