Scoring and estimating score precision using multidimensional IRT

Brown, Anna and Croudace, Tim J (2015) Scoring and estimating score precision using multidimensional IRT. In: Reise, Steven P. and Revicki, Dennis A., eds. Handbook of Item Response Theory Modeling: Applications to Typical Performance Assessment. Multivariate Applications Series. Taylor & Francis (Routledge), New York, pp. 307-333. ISBN 978-1-84872-972-8. (KAR id:40794)

PDF Author's Accepted Manuscript
Language: English


The ultimate goal of measurement is to produce a score by which individuals can be assessed and differentiated. Item response theory (IRT) modeling views responses to test items as indicators of a respondent’s standing on underlying psychological attributes, often called latent traits (van der Linden & Hambleton, 1997), and devises special algorithms for estimating this standing. This chapter gives an overview of methods for estimating person attribute scores using one-dimensional and multi-dimensional IRT models, focusing on those that are particularly useful with patient-reported outcome (PRO) measures.

Patient-reported outcome measures often capture several related constructs, a feature that may make the use of multi-dimensional IRT models appropriate and beneficial (Gibbons, Immekus & Bock, 2007). Several such models are described, including a model with multiple correlated constructs, a model where multiple constructs are underlain by a general common factor (second-order model), and a model where each item is influenced by one general and one group factor (bifactor model). To make these models more accessible to applied researchers, we provide specialized formulae for computing test information, standard errors and reliability. We show how to translate a multitude of numbers and graphs conditioned on several dimensions into easy-to-use indices that can be understood by applied researchers and test users alike. All described methods and techniques are illustrated with a single data analysis example involving a popular PRO measure, the 28-item version of the General Health Questionnaire (GHQ28; Goldberg & Williams, 1988), completed in mid-life by a large community sample as part of a major UK cohort study.

Item Type: Book section
Subjects: B Philosophy. Psychology. Religion > BF Psychology
Divisions: Divisions > Division of Human and Social Sciences > School of Psychology
Depositing User: Anna Brown
Date Deposited: 16 Apr 2014 08:15 UTC
Last Modified: 16 Feb 2021 12:53 UTC

