Kent Academic Repository

THE IMPACT OF VARIABLE SELECTION ON THE MODELLING OF OESTROGENICITY

Ghafourian, Taravat and Cronin, Mark T.D. (2005) THE IMPACT OF VARIABLE SELECTION ON THE MODELLING OF OESTROGENICITY. SAR and QSAR in Environmental Research, 16 (1-2). pp. 171-190. ISSN 1062-936X. (doi:10.1080/10629360412331319808) (KAR id:10177)

The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided.
Official URL:
http://dx.doi.org/10.1080/10629360412331319808

Abstract

Many oestrogenic chemicals exert their activity via specific interactions with the oestrogen receptor (ER).

The objective of the present study was to identify significant descriptors associated with the ER binding affinities of a large and diverse set of compounds to drive quantitative structure–activity relationships (QSARs). To this end, a variety of statistical methods were employed for variable selection. These included stepwise regression and partial least squares (PLS) analyses, as well as a non-linear recursive partitioning method (Formal Inference-based Recursive Modelling). A total of 157 molecular descriptors, including quantum mechanical, graph theoretical, indicator variables and log P, were used in the study. Furthermore, cluster analysis of variables was performed to identify groups of descriptors representing similar molecular features. Hierarchical PLS analyses were performed, where the scores of the significant components of either PLS or principal component analysis (PCA), performed separately on each cluster, were used as the variables for the top model. This reduced the number of variables representing the larger clusters, leading to a similar number of descriptors for each distinct molecular feature. The results showed that the most important molecular properties for stronger ER binding affinity are molecular size and shape, the presence of a phenol moiety as well as other aromatic groups, hydrophobicity and the presence of double bonds. The best PLS model obtained, in terms of predictive ability, was a hierarchical PLS model. However, a rigorous validation study showed that the MLR model using descriptors selected by stepwise regression has greater predictive power than the PLS models.
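
The hierarchical PLS idea described in the abstract (group correlated descriptors, summarise each group by component scores, then fit a top-level model on those scores) can be illustrated with a minimal sketch. This is not the authors' procedure or software. It assumes synthetic data, a simple greedy correlation grouping as a stand-in for the cluster analysis of variables, one PCA component per cluster, and a small NIPALS PLS1 routine. All function names and the 0.7 threshold are hypothetical choices made for illustration.

```python
import numpy as np

def cluster_variables(X, threshold=0.7):
    """Greedy grouping of columns by absolute correlation.
    Assumption: a simple stand-in for a proper cluster analysis of variables."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    unassigned = list(range(X.shape[1]))
    clusters = []
    while unassigned:
        seed = unassigned.pop(0)
        members = [seed] + [j for j in unassigned if corr[seed, j] >= threshold]
        unassigned = [j for j in unassigned if j not in members]
        clusters.append(members)
    return clusters

def pca_scores(X, n_components=1):
    """Scores of the leading principal components of a column-centred block."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def pls1_fit(X, y, n_components):
    """Single-response PLS via NIPALS; returns (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)      # weight vector
        t = Xk @ w                     # component scores
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = (yk @ t) / tt              # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

# Synthetic data (assumption): six descriptors driven by two latent features.
rng = np.random.default_rng(0)
n = 60
z1, z2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [z1 + 0.1 * rng.normal(size=n) for _ in range(3)]
    + [z2 + 0.1 * rng.normal(size=n) for _ in range(3)]
)
y = 2.0 * z1 - 1.0 * z2 + 0.1 * rng.normal(size=n)

# Base level: one PCA score per descriptor cluster.
clusters = cluster_variables(X)
top = np.hstack([pca_scores(X[:, members]) for members in clusters])

# Top level: PLS on the cluster scores instead of the raw descriptors.
coef, intercept = pls1_fit(top, y, n_components=min(2, top.shape[1]))
y_hat = top @ coef + intercept
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"clusters: {clusters}, fitted R^2: {r2:.3f}")
```

Because each cluster contributes the same number of score columns to the top model, a large cluster of near-duplicate descriptors no longer outweighs a small one, which is the balancing effect the abstract describes. The fitted R^2 here is computed on the training data only and says nothing about predictive ability; a real study would need external validation.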

Item Type: Article
DOI/Identification number: 10.1080/10629360412331319808
Subjects: Q Science
Q Science > QD Chemistry
Divisions: Divisions > Division of Natural Sciences > Medway School of Pharmacy
Depositing User: Taravat Ghafourian
Date Deposited: 06 Sep 2008 00:27 UTC
Last Modified: 16 Nov 2021 09:48 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/10177 (The current URI for this page, for reference purposes)

University of Kent Author Information

Ghafourian, Taravat.
