Martinez-Arastey, Guillermo, Datson, Naomi, Smith, Neal, Robins, Matthew (2025) Foundations of Expected Points in Rugby Union: A Methodological Approach. Journal of Sports Analytics, 11. ISSN 2215-020X. E-ISSN 2215-0218. (doi:10.1177/22150218251365220) (KAR id:110778)
PDF
Publisher pdf
Language: English
This work is licensed under a Creative Commons Attribution 4.0 International License.
Official URL: https://doi.org/10.1177/22150218251365220
Abstract
This study explores the feasibility of an Expected Points metric for rugby union, aiming to shift performance analysis from descriptive indicators to a predictive metric of possession quality. Notational analysis was conducted on 132 Premiership Rugby matches, producing a dataset of 35,199 unique phases of play containing variables such as team in possession, pitch location, type of play, score differences, time remaining, cards and the next scoring outcome. Four machine learning algorithms were explored to predict scoring outcomes: multinomial logistic regression, random forest, support vector machine and k-nearest neighbors. After extensive feature engineering and hyperparameter optimisation, the best-performing model (a random forest classifier) achieved an accuracy of 39.7% (±2.8 percentage points). This fell short of a literature-derived baseline for practical usability (44.3%), so the model was not suitable for applied contexts. A key challenge was predicting minority scoring outcomes in the presence of severe class imbalance. SMOTE was explored to address this imbalance, resulting in a lower accuracy (35.7%) but an improved F1-score of 34.4%. This study highlights the inherent limitations of modelling scoring outcomes in dynamic, open-play team sports, challenging the predominant positivist paradigm in sports performance analysis. The methodology provides critical foundational groundwork and a benchmark for future research to build upon. The study recommends exploring advanced samplers for minority classes, expanded feature sets and alternative modelling techniques, such as recurrent neural networks.
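As a rough illustration of two ideas in the abstract, the sketch below shows how an Expected Points value could be derived from predicted outcome probabilities, and why accuracy and F1-score can diverge under class imbalance. It is not the authors' implementation. The outcome classes, their point values, the example probabilities and the macro-averaging of the F1-score are all assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical outcome classes and point values (assumption: the abstract does
# not list the paper's actual classes). Negative values mean the opposition
# is the next side to score.
POINTS = {
    "try_for": 5,
    "penalty_for": 3,
    "no_score": 0,
    "penalty_against": -3,
    "try_against": -5,
}


def expected_points(probs):
    """Probability-weighted sum of point values for one phase of play."""
    if abs(sum(probs.values()) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * POINTS[outcome] for outcome, p in probs.items())


def accuracy(y_true, y_pred):
    """Share of phases whose next scoring outcome was predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro-averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up class probabilities for a single phase, as a classifier might output.
phase_probs = {
    "try_for": 0.2,
    "penalty_for": 0.1,
    "no_score": 0.5,
    "penalty_against": 0.1,
    "try_against": 0.1,
}
ep = expected_points(phase_probs)

# A toy imbalanced sample: a model that always predicts the majority class
# looks acceptable on accuracy but poor on macro F1.
y_true = ["no_score"] * 6 + ["try_for"] * 2 + ["penalty_for"] * 2
y_pred = ["no_score"] * 10
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)

print(f"expected points: {ep:.2f}")
print(f"accuracy: {acc:.2f}, macro F1: {f1:.2f}")
```

In this toy sample the always-majority predictor scores higher on accuracy than on macro F1 because it never identifies a minority outcome. That is the kind of gap resampling methods such as SMOTE aim to narrow, sometimes at the cost of overall accuracy.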
| Item Type: | Article |
|---|---|
| DOI/Identification number: | 10.1177/22150218251365220 |
| Uncontrolled keywords: | Sports performance analysis, key performance indicators, machine learning, predictive modelling, match analysis |
| Subjects: | G Geography. Anthropology. Recreation > GV Recreation. Leisure > Sports sciences |
| Institutional Unit: | Schools > School of Natural Sciences > Sports and Exercise Science |
| Former Institutional Unit: | There are no former institutional units. |
| Funders: | University of Kent (https://ror.org/00xkeyj56) |
| Depositing User: | Matthew Robins |
| Date Deposited: | 28 Jul 2025 10:41 UTC |
| Last Modified: | 14 Aug 2025 08:31 UTC |
| Resource URI: | https://kar.kent.ac.uk/id/eprint/110778 (The current URI for this page, for reference purposes) |