Kent Academic Repository

Foundations of Expected Points in Rugby Union: A Methodological Approach

Martinez-Arastey, Guillermo, Datson, Naomi, Smith, Neal, Robins, Matthew (2025) Foundations of Expected Points in Rugby Union: A Methodological Approach. Journal of Sports Analytics, 11. ISSN 2215-020X. E-ISSN 2215-0218. (doi:10.1177/22150218251365220) (KAR id:110778)

PDF Publisher pdf
Language: English


PDF Author's Accepted Manuscript
Language: English

Restricted to Repository staff only

Official URL: https://doi.org/10.1177/22150218251365220

Abstract

This study explores the feasibility of an Expected Points metric for rugby union, aiming to shift performance analysis from descriptive indicators to a predictive metric of possession quality. Notational analysis was conducted on 132 Premiership Rugby matches, producing a dataset of 35,199 unique phases of play containing variables such as team in possession, pitch location, type of play, score differences, time remaining, cards and the next scoring outcome. Four machine learning algorithms were explored to predict scoring outcomes: multinomial logistic regression, random forest, support vector machine and k-nearest neighbors. After extensive feature engineering and hyperparameter optimisation, the best-performing model (a random forest classifier) achieved 39.7% ±2.8 ppts accuracy. However, this did not meet a literature-derived baseline for practical usability (44.3%), thus the model was not suitable for applied contexts. A key challenge was predicting minority scoring outcomes due to severe class imbalance. SMOTE was explored to address this imbalance, resulting in a lower accuracy (35.7%) but an improved F1-score of 34.4%. This study highlights the inherent limitations of modelling scoring outcomes in dynamic, open-play team sports, challenging the predominant positivist paradigm in sports performance analysis. The methodology provides critical foundational groundwork and a benchmark for future research to build upon. It recommends exploring advanced samplers for minority classes, expanded feature sets and alternative modelling techniques, such as recurrent neural networks.
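
The abstract reports that resampling lowered accuracy while improving the F1-score. A minimal sketch of why these two metrics can move in opposite directions under class imbalance is given below. It is illustrative only: the labels ("try", "penalty", "none"), the toy predictions and the use of macro-averaged F1 are assumptions made for this example, not the study's data, class scheme or stated averaging method.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    pred_count = Counter(y_pred)
    true_count = Counter(y_true)
    scores = []
    for label in sorted(true_count):
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        recall = true_pos[label] / true_count[label]
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, imbalanced toy labels (not taken from the paper).
y_true = ["try"] * 6 + ["penalty"] * 2 + ["none"] * 2

# A classifier that always predicts the majority class.
majority_pred = ["try"] * 10

# A classifier that also attempts the minority classes, at the cost of
# misclassifying some majority-class samples.
spread_pred = ["try", "try", "try", "penalty", "none", "penalty",
               "penalty", "try",
               "none", "try"]

for name, pred in [("majority", majority_pred), ("spread", spread_pred)]:
    print(name, round(accuracy(y_true, pred), 3), round(macro_f1(y_true, pred), 3))
```

In this toy case the majority-class predictor scores higher on accuracy but lower on macro F1, because it never recovers a minority class. The pattern is the same in kind as the trade-off described in the abstract, although the numbers here carry no empirical meaning.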

Item Type: Article
DOI/Identification number: 10.1177/22150218251365220
Uncontrolled keywords: Sports performance analysis, key performance indicators, machine learning, predictive modelling, match analysis
Subjects: G Geography. Anthropology. Recreation > GV Recreation. Leisure > Sports sciences
Institutional Unit: Schools > School of Natural Sciences > Sports and Exercise Science
Former Institutional Unit:
There are no former institutional units.
Funders: University of Kent (https://ror.org/00xkeyj56)
Depositing User: Matthew Robins
Date Deposited: 28 Jul 2025 10:41 UTC
Last Modified: 14 Aug 2025 08:31 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/110778 (The current URI for this page, for reference purposes)

University of Kent Author Information

Robins, Matthew.
