Difference between revisions of "Users Guide"
Line 74: | Line 74: | ||
differ in their specifications), their relationships, and their | differ in their specifications), their relationships, and their | ||
instances (which differ in the parameters used in each particular | instances (which differ in the parameters used in each particular | ||
execution). | execution). VisTrails uses a change-based model to capture provenance. | ||
VisTrails uses | |||
As the scientist makes modifications to a particular dataflow, the | As the scientist makes modifications to a particular dataflow, the | ||
provenance mechanism records those changes. | provenance mechanism records those changes. |
Revision as of 06:22, 21 January 2007
There is not much documentation right now, but some of the documentation for an older version of the system is still useful: [1]. Also, our older webpage has links to a number of papers that contain information on the various features of the system: [2].
Alternatively, you can subscribe to the vistrails users mailing list. Details are on how to do that, are available here.
Getting_Started
VisTrails is available on Windows XP, Mac OS X, and Linux. These versions all have the same functionality and only differ in user interface as noted throughout this document.
There are different download options, available here. It is substantially easier to start with a binary version, and this is encouraged for first-time users. If you decided on a source version (maybe because a binary version for your architecture is not available at this time), please follow the instructions on building the software from source available here.
Starting up the binary version is system depended. On Windows XP and Mac OS X, it requires clicking on the application icon. To start the binary version on any system, you should change directory to "src/vistrails/trunk/vistrails/", where the "vistrails.py" file is available. You can start VisTrails with the following command: "python vistrails.py -l".
Depending on a number of factors, it can take a few seconds for the system to start up. You will see a splash screen while that happens. On the console, you will see some messages that show the packages being loaded. On my Mac OS X system, I get the following:
Initializing vtk Initializing pythonCalc Initializing spreadsheet Loading Spreadsheet widgets... ==> Successfully import <Basic Widgets> ==> Successfully import <Image Viewer> ==> Successfully import <VTK Viewer> ==> Successfully import <HTML Viewer> ==> Successfully import <SVG Widgets>
Also, I get two separate windows, the VisTrails Builder:
And the VisTrails Spreadsheet:
You are now ready to load a vistrail inside the system. Go to the Builder, and under "File",
there will be an "Open" option. After clicking it, you will be giving a list of files, and you can
load any of the vistrails there. For instance, if you load the "vtk_book_3rd_p189.xml", your
screen will look like this:
Each "oval" correspond to a different workflow. If you click on "final", you can "execule" that workflow either by clicking on the execute the workflow icon: .
More details on interacting with the components of VisTrails are available below.
VisTrails Modules
VisTrails Builder
You can create and edit dataflows (workflows) using the Vistrail Builder user interface. The dataflow specifications are saved in a repository that can be either local or remote. For now, we only discuss local repository here, which are "xml" files by default.
A key feature of VisTrails is the support for full provenance of the exploration process. For this, we introduced the notion of a visual trail, or a vistrail. A vistrail captures the evolution of a dataflow---all steps followed to construct a set of workflows. It represents several versions of a dataflow (which differ in their specifications), their relationships, and their instances (which differ in the parameters used in each particular execution). VisTrails uses a change-based model to capture provenance. As the scientist makes modifications to a particular dataflow, the provenance mechanism records those changes. Instead of storing a set of related dataflows, VisTrails stores the operations or changes that are applied to the dataflows, e.g., the addition of a module, the modification of a parameter, etc.
This representation is both simple and compact---it uses substantially less space than the alternative of storing multiple versions of a dataflow. In addition, it enables the construction of an intuitive interface that allows scientists to both understand and interact with the history of the dataflow through these changes. A tree-based view allows a scientist to return to a previous version in an intuitive way; to undo bad changes; to compare different dataflows; and be reminded of the actions that led to a particular result.