
Provenance-Aware CXXR

Silles, Christopher Anthony (2014) Provenance-Aware CXXR. Doctor of Philosophy (PhD) thesis, University of Kent. (KAR id:50499)

Language: English

Abstract

A provenance-aware computer system is one that records information about the operations it performs on data, so that it can give an account of the process that led to a particular item of data. Such systems allow users to ask questions of data, such as “What was the sequence of steps involved in its creation?”, “What other items of data were used to create it?”, or “What items of data used it during their creation?”. This work presents a study of how, and to what extent, the CXXR statistical programming software can be made aware of the provenance of the data on which it operates. CXXR is a variant of the R programming language and environment, which is an open source implementation of S. Notably, S was an early pioneer of provenance-aware computing, in 1988. Examples of adapting software such as CXXR for provenance-awareness are few and far between, and the idiosyncrasies of an interpreter such as CXXR, and indeed of the R language itself, present interesting challenges to provenance-awareness, such as receiving input from a variety of sources and complex evaluation mechanisms. Presented herein are designs for capturing and querying provenance information in such an environment, along with serialisation facilities that preserve data together with its provenance so that they may be distributed and/or subsequently restored to a CXXR session. Also presented is a method for making this serialised provenance information interoperable with other provenance-aware software. This work also looks at the movement towards making research reproducible, and considers that provenance-aware systems, and provenance-aware CXXR in particular, are well positioned to further the goal of making computational research reproducible.

Item Type: Thesis (Doctor of Philosophy (PhD))
Thesis advisor: Runnalls, Andrew
Uncontrolled keywords: provenance provenance-aware software adaptation PROV lineage audit reproducible research reproducibility statistical computing R CXXR
Subjects: Q Science > QA Mathematics (inc Computing science)
Divisions: Faculties > Sciences > School of Computing
Date Deposited: 16 Sep 2015 11:00 UTC
Last Modified: 29 May 2019 16:00 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/50499 (The current URI for this page, for reference purposes)