What Happened?

Using Provenance for Compliance and Verification

Dr. Paul Groth

Provenance enables users of computational systems to trace how a result was produced. This is critical for both business and science. Scientists, for example, cannot reproduce, analyse or validate their experiments without provenance. Likewise, businesses must demonstrate that their systems' results were produced in a regulatory-compliant manner.

We propose a new definition of provenance specifically suited to the computational model that underpins service-oriented architectures: the provenance of a result is the process that led to that result. Our aim is to conceive a computer-based representation of provenance that allows us to perform useful reasoning about the origin of results. We examine the nature of such a representation, which is articulated around the documentation of process.

We then examine the architecture of a provenance system. This architecture is centered around the notion of a store designed to support the provenance life cycle, which consists of two phases: the recording phase and the reasoning phase. Process documentation is first archived in the store and is subsequently reasoned over. We then show how this system can be used for validation in a specific bioinformatics application.
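
The two-phase life cycle can be illustrated with a minimal Python sketch. It is not the PASOA or EU Provenance implementation: the names `Assertion`, `ProvenanceStore`, `record` and `trace`, and the example services `align` and `score`, are hypothetical and chosen only for illustration. The sketch assumes each result is produced by at most one recorded step.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Assertion:
    """One piece of process documentation (hypothetical structure):
    a service consumed some inputs and produced one output."""
    service: str
    inputs: tuple
    output: str


@dataclass
class ProvenanceStore:
    """Illustrative in-memory store; a real store would be persistent."""
    assertions: list = field(default_factory=list)

    def record(self, service, inputs, output):
        """Recording phase: archive documentation of one process step."""
        self.assertions.append(Assertion(service, tuple(inputs), output))

    def trace(self, result):
        """Reasoning phase: return the steps that led to `result`,
        ordered so that upstream steps come first."""
        # Assumption: each output has a single producing step.
        producers = {a.output: a for a in self.assertions}
        ordered, seen = [], set()

        def visit(item):
            step = producers.get(item)
            if step is None or item in seen:
                return  # raw input, or already visited
            seen.add(item)
            for source in step.inputs:
                visit(source)
            ordered.append(step)

        visit(result)
        return ordered


# Recording phase: two steps of a made-up bioinformatics workflow.
store = ProvenanceStore()
store.record("align", ["sequence_a", "sequence_b"], "alignment")
store.record("score", ["alignment"], "similarity")

# Reasoning phase: which process led to "similarity"?
steps = [a.service for a in store.trace("similarity")]
print(steps)
```

Keeping recording and reasoning as separate operations on the same store mirrors the separation of phases described above: documentation is archived as the process runs, and queries are answered later from what was archived.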

The presentation will draw upon our experience in the PASOA (www.pasoa.org) and EU Provenance (www.gridprovenance.org) projects and will rely on use cases from the domains of bioinformatics, finance, high energy physics, organ transplant management and aerospace engineering.