An important aspect of provenance is the description of the subject. Subject provenance includes birth and death dates (for post-mortem studies), in addition to the age of the subject at the time of data collection (or death). Sex and species are captured, further qualified by strain and genetic manipulation in the case of non-human subjects. Treatments, such as disease induction in experimental models, drug treatment, and combinations of treatments, can be documented in the schema. Subject name has been explicitly excluded in order to protect patient privacy (http://www.hhs.gov/ocr/index.html), with SubjectID standing in as a unique identifier for a given subject. These elements are extensible, allowing for multiple treatments or clinical evaluations. Subject provenance has been described in a simple yet flexible format so that it is easily accessible to the community and requires a minimum of work to adapt for specialized use (Appendix A). Moreover, the subject provenance is defined in its own independent XSD, so it can easily be modified or even replaced with a definition better suited to a specific kind of study, such as the XCEDE data exchange schema (Keator et al., 2006) for functional MRI (fMRI) studies.
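As an illustration of the subject description above, the following minimal Python sketch assembles a subject record with the standard-library `xml.etree.ElementTree` module. Only `SubjectID` is named in the text; every other element name, the `units` attribute, the helper `build_subject`, and the sample values are assumptions made for this example and are not taken from the actual XSD (see Appendix A for the real definition).

```python
import xml.etree.ElementTree as ET


def build_subject(subject_id, species, sex, age_years, treatments=(), strain=None):
    """Build a subject provenance element.

    All tag names except SubjectID are hypothetical placeholders.
    """
    subject = ET.Element("subject")
    # No name element is written: SubjectID is the only identifier.
    ET.SubElement(subject, "SubjectID").text = subject_id
    ET.SubElement(subject, "species").text = species
    if strain is not None:
        # Strain only qualifies non-human subjects.
        ET.SubElement(subject, "strain").text = strain
    ET.SubElement(subject, "sex").text = sex
    ET.SubElement(subject, "age", units="years").text = str(age_years)
    # The treatment element repeats, so any number of treatments can be recorded.
    for name in treatments:
        ET.SubElement(subject, "treatment").text = name
    return subject


# Illustrative values only.
record = build_subject(
    "S0042", "Mus musculus", "F", 0.5,
    treatments=["disease induction", "drug treatment"],
    strain="C57BL/6",
)
print(ET.tostring(record, encoding="unicode"))
```

Repeating the `treatment` element is one simple way to model the extensibility described above; a real schema might instead nest dose, date, or evaluation details under each treatment.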
The description of how a set of data was acquired is of critical importance for data provenance. Crucial elements of the acquisition provenance are captured by extracting that information from the image header or by requesting it from the user. The information requested from the user depends on the kind of data acquired. For example, when collecting acquisition provenance for an MR image, the acquisition type (2D vs. 3D), weighting (proton density, T1, T2, etc.), pulse sequence, flip angle, echo time (TE), repetition time (TR), inversion time (TI), matrix dimensions, step sizes, magnet field strength, coil used, and equipment manufacturer and model are explicitly captured in the XSD. These elements are far from exhaustive, but they are easily expanded and/or extended to accommodate other imaging modalities, from diffusion tensor imaging (DTI) to positron emission tomography (PET) (http://www.loni.usc.edu/Software/NI_Provenance).
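The two sources of acquisition provenance described above, the image header and the user, can be combined as in the following Python sketch. It is an illustration under stated assumptions, not the schema's actual layout: the field names in `MRI_FIELDS`, the `source` and `modality` attributes, the helper `build_acquisition`, the rule that header values take precedence over user-supplied ones, and the sample values are all hypothetical. Matrix dimensions and step sizes are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names, loosely following the MRI items listed in the text.
MRI_FIELDS = [
    "acquisitionType", "weighting", "pulseSequence", "flipAngle",
    "echoTime", "repetitionTime", "inversionTime", "fieldStrength",
    "coil", "manufacturer", "model",
]


def build_acquisition(header, user_input):
    """Merge header-derived and user-supplied values.

    Returns the acquisition element and the list of fields still missing.
    Header values win when both sources provide a field (an assumption).
    """
    acquisition = ET.Element("acquisition", modality="MRI")
    missing = []
    for field in MRI_FIELDS:
        if field in header:
            value, source = header[field], "header"
        elif field in user_input:
            value, source = user_input[field], "user"
        else:
            # Nothing known yet: the user would have to be asked for this field.
            missing.append(field)
            continue
        ET.SubElement(acquisition, field, source=source).text = str(value)
    return acquisition, missing


# Illustrative values only.
header = {"echoTime": 4.4, "repetitionTime": 2300, "flipAngle": 8, "fieldStrength": 3.0}
user = {"weighting": "T1", "acquisitionType": "3D"}
acq, missing = build_acquisition(header, user)
print(ET.tostring(acq, encoding="unicode"))
print("still needed from the user:", missing)
```

Supporting another modality in this sketch would amount to supplying a different field list, which mirrors the idea that the acquisition elements can be expanded or extended per modality.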