Kent Academic Repository

Trustworthy Molecular AI: Multi-Dimensional Validation of LLM Generated Chemical Descriptions

Cook, Frederika, Masala, Giovanni Luca (2026) Trustworthy Molecular AI: Multi-Dimensional Validation of LLM Generated Chemical Descriptions. In: Springer Lecture Notes in Computer Science (LNCS). (In press) (Access to this publication is currently restricted. You may be able to access a copy if URLs are provided) (KAR id:114278)

PDF Author's Accepted Manuscript
Language: English

Restricted to Repository staff only
Official URL:
https://www.iccs-meeting.org/iccs2026/

Abstract

Large Language Models (LLMs) offer unprecedented opportunities for molecular property prediction, yet exhibit concerning capabilities for generating plausible-sounding but factually incorrect chemical descriptions. Whilst traditional evaluation metrics measure linguistic similarity, they cannot detect chemically impossible claims. Domain benchmarks focus on entity-level verification rather than systematically validating whether claimed structures are computationally possible. We propose a multi-dimensional validation framework, integrating grammatical fluency and lexical validity checking, RDKit-based computational chemistry validation, and DrugBank verification. Evaluating three architecturally distinct LLMs (MolT5, LLaMA, TxGemma) across multiple decoding strategies on 451 antibiotic compounds, we demonstrate that multi-dimensional validation identifies distinct error types invisible to individual metrics. Our interactive dashboard enables inspection at multiple granularities, transforming validation from post-hoc comparison into evidence-based analysis. Our findings reveal systematic blind spots: perfect grammar accompanies fundamental chemical errors, aggregate metrics mask substantial performance differences, and domain-specific pre-training fails to transfer across chemistry tasks. This work sets foundations for trustworthy AI in chemistry, providing quality assurance infrastructure necessary for deploying LLMs in high-stakes applications that carry material consequences.

Item Type: Conference proceeding
Subjects: Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming,
Institutional Unit: Schools > School of Computing
Funders: University of Kent (https://ror.org/00xkeyj56)
Depositing User: Giovanni Masala
Date Deposited: 01 May 2026 12:17 UTC
Last Modified: 05 May 2026 15:31 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/114278 (The current URI for this page, for reference purposes)

University of Kent Author Information

Masala, Giovanni Luca.

Creator's ORCID: https://orcid.org/0000-0001-6734-9424
