ExecutablePapers


Introduction

Computational experiments have become an integral part of the scientific method, yet repeating them remains a challenge: they often require specific hardware, non-trivial software installation, and complex manipulations to obtain results. We posit that integrating data acquisition, derivation, analysis, and visualization as executable components throughout the publication process will make it easier to generate and share repeatable results. We have built an extensible infrastructure to support the life cycle of executable publications: their creation, review, and reuse. Our focus is on papers whose computational experiments can be reproduced and validated. Our approach is orthogonal to others that focus on semantics and authoring, and can be combined with them.

We have written a paper that details the challenges of computational repeatability and the solutions we have developed: http://www.cs.utah.edu/~juliana/pub/vistrails-executable-paper.pdf

Infrastructure

We have developed a set of techniques to help authors, reviewers, and readers construct and interact with executable papers. Many of these techniques have been integrated with the VisTrails system and the crowdLabs site. VisTrails is an open-source scientific workflow and provenance management system, and our extensions for executable papers allow users to create papers and Web publications whose figures and results are directly tied to the computations that generated them. Both the LaTeX and MediaWiki extensions call VisTrails (either a server or local executable) to embed results into their documents. These capabilities are included in the VisTrails 1.6.2 release available for Windows, Mac, and Linux.

Authors can also choose to couple a publication with the actual computations by embedding links to the workflows in the paper. The embedding wizard in VisTrails can generate these links automatically when the computations are hosted on an accessible database. Using a compatible PDF reader (e.g. Adobe Acrobat), readers can click on a result, and the embedded link will retrieve a vtl file that contains either the workflow itself or the information needed to access a remotely hosted workflow. When VisTrails opens a vtl file, it loads the corresponding workflow. Note that you may have to associate the vtl file extension with VisTrails in your operating system. A reader can then explore the result by changing the input data or the original parameters, and might choose to publish the modified result to a Web page or another publication.
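To make the role of the vtl file more concrete, here is a minimal Python sketch of how a tool might inspect one to decide whether the workflow is hosted remotely. It assumes the vtl file is a small XML document; the element name `vtlink` and the attributes `host`, `port`, `database`, `vtid`, and `version` are assumptions made for illustration rather than a specification of the format, and `describe_vtl` is a hypothetical helper, not part of VisTrails.

```python
import xml.etree.ElementTree as ET

# Hypothetical vtl content. The element and attribute names below are
# assumptions for illustration, not a specification of the real format.
SAMPLE_VTL = """<vtlink host="example.org" port="3306" database="vistrails"
                        vtid="42" version="7"/>"""


def describe_vtl(text):
    """Summarize where the workflow referenced by a vtl document lives."""
    attrs = ET.fromstring(text).attrib
    return {
        # Treat the link as remote when a database host is given.
        "remote": "host" in attrs,
        "host": attrs.get("host"),
        "port": int(attrs["port"]) if "port" in attrs else None,
        "database": attrs.get("database"),
        "vistrail_id": attrs.get("vtid"),
        "version": attrs.get("version"),
    }


info = describe_vtl(SAMPLE_VTL)
print(info)
```

Under these assumptions, a link with no `host` attribute would be treated as carrying the workflow locally rather than pointing at a database.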

Demonstrations

To see our infrastructure in action, please watch the following videos:


Examples of Executable Publications

ALPS2.0

For a real example of an executable paper whose results can be reproduced and validated, check out the PDF of the ALPS2.0 paper at http://arxiv.org/pdf/1101.2646

To repeat the experiments shown in the paper, you will need to download VisTrails from: http://www.vistrails.org/index.php/Downloads The ALPS2.0 package is included in VisTrails 1.6.

CFD Flow Analysis

Another example of an executable publication can be found at: http://www.vistrails.org/index.php/User:Tohline/CPM/Levels2and3 Because this paper is published on a Wiki, it is possible to interact with the results using a Web browser. Try it out!

WikiQuery

Here's a case study we did for the SIGMOD Repeatability effort: http://www.cs.utah.edu/~juliana/WikiQuery/WikiQuery_casestudy.pdf

For more details on how to run the WikiQuery experiments and to obtain them (data, code, workflows), see WikiQuery.