Kent Academic Repository

3D latent diffusion model for MR-only radiotherapy: accurate and consistent synthetic CT generation

Mahdi, Mohammed A., Al-Shalabi, Mohammed, Alnfrawy, Ehab T., Elbarougy, Reda, Hadi, Muhammad Usman, Ali, Rao Faizan (2025) 3D latent diffusion model for MR-only radiotherapy: accurate and consistent synthetic CT generation. Diagnostics, 15 (23). Article Number 3010. ISSN 2075-4418. (doi:10.3390/diagnostics15233010) (KAR id:112289)

Abstract

Background: The clinical imperative to reduce patient exposure to ionizing radiation during diagnosis and treatment planning calls for robust, high-fidelity synthetic imaging. Current cross-modal synthesis techniques, primarily based on GANs and deterministic CNNs, exhibit training instability and critical errors in modeling high-contrast tissues, which limits their reliability for safety-critical applications such as radiotherapy. Objectives: Our primary objective was to develop a stable, high-accuracy framework for 3D Magnetic Resonance Imaging (MRI) to Computed Tomography (CT) synthesis capable of generating clinically equivalent synthetic CTs (sCTs) across multiple anatomical sites. Methods: We introduce a novel 3D Latent Diffusion Model (3DLDM) that operates in a compressed latent space, mitigating the computational burden of 3D diffusion while leveraging the stability of the denoising objective. Results: Across the Head & Neck, Thorax, and Abdomen, the 3DLDM achieved a Mean Absolute Error (MAE) of 56.44 Hounsfield Units (HU). This corresponds to a significant 3.63 HU (6.04%) reduction in overall error compared to the strongest adversarial baseline, CycleGAN (MAE = 60.07 HU, p < 0.05), a 10.76 HU (16.01%) reduction compared to NNUNet (MAE = 67.20 HU, p < 0.01), and a 20.79 HU (26.92%) reduction compared to the transformer-based SwinUNeTr (MAE = 77.23 HU, p < 0.0001). The model also achieved the highest structural similarity (SSIM = 0.885 ± 0.031), significantly exceeding SwinUNeTr (p < 0.0001), NNUNet (p < 0.01), and Pix2Pix (p < 0.0001). Likewise, the 3DLDM achieved the highest peak signal-to-noise ratio (PSNR = 29.73 ± 1.60 dB), with statistically significant gains over CycleGAN (p < 0.01), NNUNet (p < 0.001), and SwinUNeTr (p < 0.0001). Conclusions: This work validates a scalable, accurate approach for volumetric synthesis, positioning the 3DLDM to enable MR-only radiotherapy planning and accelerate radiation-free multi-modal imaging in the clinic.
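
As a quick check on the error figures in the abstract, the following minimal sketch computes the absolute (HU) and relative (percent) MAE reduction of the 3DLDM against each baseline. The MAE values are the ones reported above; the function names `mae_reduction` and `summarize` are illustrative and do not come from the paper. The differences 3.63, 10.76, and 20.79 match the absolute gaps in HU, while the relative reductions come out larger.

```python
# MAE values in Hounsfield Units (HU), copied from the abstract.
reported_mae = {
    "3DLDM": 56.44,
    "CycleGAN": 60.07,
    "NNUNet": 67.20,
    "SwinUNeTr": 77.23,
}


def mae_reduction(baseline: float, proposed: float) -> tuple[float, float]:
    """Return (absolute reduction in HU, relative reduction in percent)."""
    absolute = baseline - proposed
    relative = 100.0 * absolute / baseline
    return absolute, relative


def summarize(mae: dict[str, float], proposed: str = "3DLDM") -> dict[str, tuple[float, float]]:
    """Compare the proposed model against every other entry, rounded to 2 decimals."""
    result = {}
    for name, value in mae.items():
        if name == proposed:
            continue
        absolute, relative = mae_reduction(value, mae[proposed])
        result[name] = (round(absolute, 2), round(relative, 2))
    return result


if __name__ == "__main__":
    for name, (abs_hu, rel_pct) in summarize(reported_mae).items():
        print(f"{name}: {abs_hu:.2f} HU lower MAE ({rel_pct:.2f}% relative reduction)")
```

The relative reduction is taken with respect to each baseline's MAE, which is one common convention; other normalisations would give different percentages.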

Item Type: Article
DOI/Identification number: 10.3390/diagnostics15233010
Uncontrolled keywords: latent diffusion models; medical image synthesis; MRI-to-CT translation; 3D volumetric imaging; generative models; synthetic CT; radiotherapy planning
Subjects: H Social Sciences
Institutional Unit: Schools > School of Computing
Former Institutional Unit:
There are no former institutional units.
SWORD Depositor: JISC Publications Router
Depositing User: JISC Publications Router
Date Deposited: 08 Dec 2025 11:57 UTC
Last Modified: 10 Dec 2025 14:23 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/112289 (The current URI for this page, for reference purposes)

University of Kent Author Information

Ali, Rao Faizan.

Creator's ORCID: https://orcid.org/0000-0003-0701-6761
CRediT Contributor Roles: Conceptualisation, Writing - original draft, Supervision