Difference between revisions of "FAQ"

From VistrailsWiki
 
(107 intermediate revisions by 9 users not shown)
Line 14: Line 14:
'''NOTE:''' If you downloaded the MacOS X bundle, you can run vistrails from the command line via the following commands in the Terminal.  Change the current directory to wherever VisTrails was installed (often <code>/Applications</code>), and then type:
'''NOTE:''' If you downloaded the MacOS X bundle, you can run vistrails from the command line via the following commands in the Terminal.  Change the current directory to wherever VisTrails was installed (often <code>/Applications</code>), and then type:


<code>Vistrails.app/Contents/MacOS/vistrails [<cmd_line_options>]</code>
<code>Vistrails.app/Contents/MacOS/vistrails [<cmd_line_options>]</code> [http://www.vistrails.org/usersguide/v2.1/html/batch.html#running-a-specific-workflow-in-batch-mode Running a Specific Workflow in Batch Mode]


=== Using the command line, we'd like to execute a workflow multiple times, with slightly different parameters, and create a series of output files. Is this possible?===
=== Using the command line, we'd like to execute a workflow multiple times, with slightly different parameters, and create a series of output files. Is this possible?===
Line 24: Line 24:
<code>python vistrails.py -b ../examples/offscreen.vt:offscreen -a"filename=other.png"</code>
<code>python vistrails.py -b ../examples/offscreen.vt:offscreen -a"filename=other.png"</code>


<code>filename</code> in the example above is the alias name assigned to the parameter in the <code>value</code> method inside the String module. When running a pipeline from the command line, VisTrails will try to start the spreadsheet automatically if the pipeline requires it. For example, this other execution will also start the spreadsheet:
<code>filename</code> in the example above is the alias name assigned to the parameter in the <code>value</code> method inside the String module. When running a pipeline from the command line, VisTrails will try to start the spreadsheet automatically if the pipeline requires it. For example, this other execution will also start the spreadsheet (attention to how $ characters are escaped when running on bash):  


<code>python vistrails.py -b ../examples/head.vt:aliases \
<code>python vistrails.py -b ../examples/head.vt:aliases -a"isovalue=30\$&\$diffuse_color=0.8,0.4,0.2"</code>
-a"isovalue=30&Diffuse_Color_R=0.8&Diffuse_Color_G=0.4&Diffuse_Color_B=0.2"</code>


You can also execute more than one pipeline on the command line:
You can also execute more than one pipeline on the command line:
Line 34: Line 33:
-a"isovalue=30"</code>
-a"isovalue=30"</code>


Use the -a parameter only once regardless the number of pipelines.
Use the -a parameter only once, regardless of the number of pipelines. [http://www.vistrails.org/usersguide/v2.1/html/batch.html#running-a-workflow-with-specific-parameters Running a Workflow with Specific Parameters]
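To produce a series of output files, one option is to generate the batch commands from a small script. The following is a minimal sketch, not part of VisTrails: it assumes the <code>-b</code> and <code>-a</code> options and the <code>$&$</code> alias separator shown in the examples above, and the helper name <code>build_batch_command</code> and the output file names are made up for illustration.

```python
import subprocess


def build_batch_command(vt_file, version, aliases, script="vistrails.py"):
    """Return the argument list for one batch-mode run.

    `aliases` maps alias names to values. Pairs are joined with "$&$",
    the separator used in the command-line examples above. Because the
    list is passed to subprocess without a shell, no escaping is needed.
    """
    alias_arg = "$&$".join("%s=%s" % (name, value)
                           for name, value in aliases.items())
    return ["python", script, "-b", "%s:%s" % (vt_file, version),
            "-a" + alias_arg]


# One command per output file (hypothetical file names).
commands = [
    build_batch_command("../examples/offscreen.vt", "offscreen",
                        {"filename": "output_%03d.png" % i})
    for i in range(3)
]


def run_all(cmds):
    """Run each command in turn; raises CalledProcessError on failure."""
    for cmd in cmds:
        subprocess.check_call(cmd)
```

Calling <code>run_all(commands)</code> from the directory that contains <code>vistrails.py</code> would then execute the workflow once per file name.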


=== I can load a vistrail, and the version tree shows up fine. However, no pipelines appear when I click on a version. What gives? ===
=== I can load a vistrail, and the version tree shows up fine. However, no pipelines appear when I click on a version. What gives? ===
Line 47: Line 46:


Refer to the package documentation for details. The one inconvenient step is that currently there's no automated way to describe what is the missing package. We're working on this feature for future releases.
Refer to the package documentation for details. The one inconvenient step is that currently there's no automated way to describe what is the missing package. We're working on this feature for future releases.
=== I have a workflow that reads a file and then does some processing. The first time it runs, it executes correctly. But in subsequent runs, nothing happens. ===
VisTrails caches by default, so after a workflow is executed, if none of its parameters change, it won't be executed again.
If a workflow reads a file using the basic module File, VisTrails does check whether the file was modified since the last run. It does so by keeping a signature that is based on the modification time of the file. And if the file was modified, the File module and all downstream modules (the ones which depend on File) will be executed.
''Note: If you would like your input and output data to be versioned, you can use the Persistence package.''
If you do not want VisTrails to cache executions, you can turn off caching: go to Menu Edit -> Preferences and in the General Configuration tab, change Cache execution results to Never. [http://www.vistrails.org/usersguide/v2.1/html/getting_started.html#workflow-execution Workflow Execution]
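The idea of a signature based on parameters and file modification time can be illustrated with a small standalone sketch. This is not the VisTrails caching code; <code>file_signature</code>, <code>CachedStep</code> and <code>count_lines</code> are hypothetical names used only to show why an unchanged file leads to no re-execution while a modified one does.

```python
import hashlib
import os
import tempfile


def file_signature(path, params):
    """Hash the parameters together with the file's modification time."""
    h = hashlib.sha1()
    h.update(repr(sorted(params.items())).encode("utf-8"))
    h.update(repr(os.path.getmtime(path)).encode("utf-8"))
    return h.hexdigest()


class CachedStep(object):
    """Run `func` only when the signature has not been seen before."""

    def __init__(self, func):
        self.func = func
        self.cache = {}
        self.executions = 0

    def run(self, path, **params):
        sig = file_signature(path, params)
        if sig not in self.cache:
            self.executions += 1
            self.cache[sig] = self.func(path, **params)
        return self.cache[sig]


def count_lines(path):
    with open(path) as f:
        return sum(1 for _ in f)


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\nb\n")

step = CachedStep(count_lines)
first = step.run(tmp.name)       # executes
second = step.run(tmp.name)      # same signature: served from the cache
os.utime(tmp.name, (0, 12345))   # simulate a modification of the file
third = step.run(tmp.name)       # signature changed: executes again
os.remove(tmp.name)
```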
=== Can VisTrails execute workflows in parallel? ===
The VisTrails server can only execute pipelines in parallel if there's more than one instance of VisTrails running. The command
<code> self.rpcserver = ThreadedXMLRPCServer((self.temp_xml_rpc_options.server, self.temp_xml_rpc_options.port))</code>
starts a multithreaded version of the XML-RPC server, so it will create a thread for each request received by the server. The problem is that Qt/PyQt only allows GUI objects to be created in the main thread, not in these additional threads. To overcome this limitation, the multithreaded version can instantiate other single-threaded instances of VisTrails and put them in a queue, so that workflow executions and other GUI-related requests, such as generating workflow graphs and history trees, can be forwarded to this queue, with each instance taking turns answering requests. If the results are in the cache, the multithreaded version answers the requests directly.
Note that this infrastructure works on Linux only. To make this work on Windows, you have to create a script similar to start_vistrails_xvfb.sh (located in the scripts folder) where you can send the number of other instances via command-line options to VisTrails. The command line options are:
python vistrails_server.py -T <ADDRESS> -R <PORT> -O<NUMBER_OF_OTHER_VISTRAILS_INSTANCES> [-M]&
If you want the main vistrails instance to be multithreaded, use the -M at the end.
After creating this script, update function start_other_instances in vistrails/gui/application_server.py lines 1007-1023 and set the script variable to point to your script. You may also have to change the arguments sent to your script (line 1016: for example, you  don't need to set a virtual display).  You will need to change the path to the stop_vistrails_server.py script (on line 1026) according to your installation path. [http://www.vistrails.org/usersguide/v2.1/html/batch.html#executing-workflows-in-parallel Executing Workflows in Parallel]
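For readers unfamiliar with the thread-per-request pattern mentioned above, here is a minimal standalone sketch using only the Python 3 standard library. It is an illustration, not the VisTrails server: the class name <code>DemoThreadedXMLRPCServer</code> and the function <code>handler_thread_name</code> are hypothetical, and the server listens on a free local port chosen by the operating system.

```python
import threading
import xmlrpc.client
from socketserver import ThreadingMixIn
from xmlrpc.server import SimpleXMLRPCServer


class DemoThreadedXMLRPCServer(ThreadingMixIn, SimpleXMLRPCServer):
    """XML-RPC server that handles every request in its own thread."""
    daemon_threads = True


def handler_thread_name():
    """Report which thread is answering the current request."""
    return threading.current_thread().name


# Port 0 asks the OS for any free port on the loopback interface.
server = DemoThreadedXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(handler_thread_name)
port = server.server_address[1]

serving = threading.Thread(target=server.serve_forever, daemon=True)
serving.start()

proxy = xmlrpc.client.ServerProxy("http://127.0.0.1:%d" % port)
worker_name = proxy.handler_thread_name()

server.shutdown()
server.server_close()
```

The request is answered by a worker thread rather than by the main thread, which is exactly why GUI objects cannot be created while handling it.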
=== When a workflow is executed, what do the colors mean? ===
- lilac: module was not executed
 
- yellow: module is currently being executed
 
- green: module was successfully executed
 
- orange: module was cached
 
- red: the execution of the module failed
[http://www.vistrails.org/usersguide/v2.1/html/getting_started.html#workflow-execution Workflow Execution]
=== Workflow execution hangs on Windows ===
This can happen if you are using "quick edit mode" in the console and have print statements in your code. Standard output can then get blocked by the console. Pressing space in the console resumes the execution. To avoid this problem, either disable "quick edit mode", or avoid print statements in your code.
=== VisTrails does not install missing system packages ===
If VisTrails does not try to install missing system packages, it may be because it cannot determine your system type. In that case you can run this (in Python) to determine your system type:
    import platform
    platform.linux_distribution()
And add this system name to gui/bundles/utils.py by, e.g., modifying the _guess_ubuntu method (if your system is apt-based):
    def _guess_ubuntu():
        return platform.linux_distribution()[0]=='Ubuntu' or \
              platform.linux_distribution()[0]=='YourSystemName'
=== Cannot update subworkflows after upgrading packages or vistrails version ===
When a package used by a subworkflow is upgraded, any subworkflows that use it will be automatically upgraded. An upgraded subworkflow may then lose the ability to be updated to a newer local subworkflow. In this case the subworkflow needs to be updated by hand by removing it from the pipeline and dragging it in again from the module palette. This may get fixed in a future release.


== Building workflows ==
== Building workflows ==
Line 52: Line 113:
=== Is there a way to give each widget a "display name" in addition to the module name at the center of the widget? ===
=== Is there a way to give each widget a "display name" in addition to the module name at the center of the widget? ===


Yes, but it is not easily accessible from the GUI and it definitely needs to be more intuitive. For now, we use the annotation value of key "__desc__" as a module label. If you want to set a PythonSource label, you have to select the module. Then click on the Annotation tab, and add a key named "__desc__", whatever value you set to this key will be the label. We are currently working on a new interface for this functionality.
Yes, a "display name" can be assigned to a module by selecting the triangle in its top right corner to open a popup menu and selecting the Set Module Label... menu item. You will then be prompted to enter the "display name". [http://www.vistrails.org/usersguide/v2.1/html/creating.html#configuring-module-labels Changing Module Labels]


=== Is there a way to re-center the picture-in-picture (PiP) view? ===
=== Is there a way to re-center the picture-in-picture (PiP) view? ===


Yes.  If you click on the PIP window to bring it to focus, you can press Ctrl-R (or Command-R on Mac) to re-center the PiP window.
Yes.  If you click on the PIP window to bring it to focus, you can press Ctrl-R (or Command-R on Mac) to re-center the PiP window. [http://www.vistrails.org/usersguide/v2.1/html/getting_started.html#vistrails-interaction Vistrails Interaction]
 
=== How do I search for a literal "?" (question mark) in the search box in the Property panel? ===
 
Since we allow regular expressions in our search box, question marks are treated as meta-characters. Thus, searching for "?" returns everything and "abc?" will return everything containing "abc".  You need to use "\?" instead to search for "?". So the search for "??" would be "\?\?".  [http://www.vistrails.org/usersguide/v2.1/html/querying.html#textual-queries Textual Queries]
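The escaping rule can be tried out with Python's <code>re</code> module. This is only an illustration: the search box may use a different regular expression engine, and the sample labels below are made up.

```python
import re

labels = ["what?", "abc", "ab", "really??"]


def search(pattern, items):
    """Return the items that contain a match for the pattern."""
    return [item for item in items if re.search(pattern, item)]


# A backslash turns the meta-character into a literal question mark.
literal = search(r"\?", labels)
double = search(r"\?\?", labels)

# re.escape builds the escaped pattern from a literal string.
escaped = re.escape("??")
```

Here <code>literal</code> contains only the labels with a question mark, <code>double</code> only the label with two in a row, and <code>escaped</code> is the pattern <code>\?\?</code>.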
 
=== Saving a vistrail fails when Running VisTrails on Windows inside a Virtual Machine ===
 
After installing Windows in a Virtual Machine, the path to zip.exe may be missing, and you may see this error when trying to save a vistrail:
 
    WindowsError: [Error 2] The system cannot find the file specified: '***/vt.zip'
 
Then you need to add the path to zip.exe, which is included in the binary distribution of VisTrails, to your PATH variable.
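A quick way to check whether <code>zip</code> can be found is the following sketch, which uses only the Python 3 standard library. The helper name <code>find_zip</code> is hypothetical, and the optional <code>extra_dir</code> argument stands for whatever directory holds zip.exe on your system.

```python
import os
import shutil


def find_zip(extra_dir=None):
    """Return the full path of the zip executable, or None if not found.

    If extra_dir is given, it is searched before the directories in PATH.
    """
    path = os.environ.get("PATH", "")
    if extra_dir:
        path = extra_dir + os.pathsep + path
    return shutil.which("zip", path=path)


zip_path = find_zip()
```

If <code>zip_path</code> is <code>None</code>, the directory containing zip.exe is not yet on your PATH.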
 
== Using VisTrails as a server ==
 
=== What is the VisTrails server-mode? ===
 
Using the VisTrails server mode, it is possible to ''execute workflows and control VisTrails through another application''. For example, the CrowdLabs Web portal (http://www.crowdlabs.org) accesses a VisTrails server to execute workflows and to retrieve and display vistrail trees and workflows. [http://www.vistrails.org/usersguide/v2.1/html/batch.html#using-vistrails-as-a-server Using VisTrails as a Server]
 
=== How do I execute workflows and control VisTrails through another application? ===
 
You access the server by making XML-RPC calls. In the current VisTrails release, we include a set of PHP scripts that can talk to a VisTrails server instance. They are in the "extensions/http" folder. The files are reasonably well documented. It should also not be difficult to create Python scripts to access the server (just use the xmlrpclib module).
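As a rough sketch of what such a Python client involves, the snippet below uses <code>xmlrpc.client</code>, the Python 3 name of <code>xmlrpclib</code>. The server URL, the method name <code>hypothetical_method</code> and its arguments are placeholders, not the real VisTrails server API; check the PHP scripts or the server source for the actual method names. No network connection is made here: the example only builds and decodes the request that would be sent.

```python
import xmlrpc.client

# Placeholder address; use the host and port of your own server instance.
SERVER_URL = "http://localhost:8080"

# Creating the proxy does not connect; a connection is only opened
# when a remote method is actually called on it.
proxy = xmlrpc.client.ServerProxy(SERVER_URL)

# Build the XML body for a call to a made-up method with two arguments.
payload = xmlrpc.client.dumps(("example.vt", 1),
                              methodname="hypothetical_method")

# Decode it again to show what the server would receive.
method_args, method_name = xmlrpc.client.loads(payload)
```

With a running server, the equivalent call would be written as <code>proxy.hypothetical_method("example.vt", 1)</code>, with the real method name substituted.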
 
Note that the VisTrails server requires the provenance and workflows to be in a database. More detailed instructions on how to set up the server and the database are available here:
 
http://www.crowdlabs.org/site_media/static/dev_docs/vistrails_server_setup.html
 
http://www.crowdlabs.org/site_media/static/dev_docs/vistrails_database_setup.html
 
If what you want is just to execute a series of workflows in batch mode, a simpler solution would be to ''use the VisTrails client in batch mode''. Chapter 12 of the user's guide contains detailed information and examples on that.  [http://www.vistrails.org/usersguide/v2.1/html/batch.html#sec-cli-batch Running VisTrails in Batch Mode]
 
=== VisTrails server executes a workflow but generates a blank image and generates the error message ''cannot get access to X server'' ===
 
You will need to check whether the display the server is trying to use is a valid display (by default it uses display 0).  On Linux, the command ''w'' will list the logged-in users and the display associated with them (''FROM'' column).
 
Note that the VisTrails server requires the machine to be running X.
 
=== cannot get access to X server ===
 
Running VisTrails in '''server or batch mode''' requires a connection to an X server.
 
No additional setup is required if you run VisTrails on a terminal because you are already logged in to X. To make it work in other scenarios, you need to run the python command through Xvfb or make sure you can run cgi scripts that access the GUI.
 
If you can run Xvfb, you can use the following script, where you need to configure the first four variables according to your system: http://www.vistrails.org/images/Run_vistrails_batch_xvfb.script.sh.txt
 
(To run the script, rename the file and remove the ".txt")
 
 
You should also modify your cgi script to invoke the bash script instead of vistrails directly. The bash script will accept the virtual display, the vistrail file and workflow tag as input arguments.
 
Another possibility is if your workflow does not require the GUI, you can use VisTrails as a regular python module and it will not require the GUI or X Server to run. This functionality is available in the nightly builds and will be included in VisTrails 2.0 beta to be released soon. There is an example of how to use this feature in our FAQ: http://www.vistrails.org/index.php/FAQ#Using_VisTrails_as_a_Python_module
 
== Problems starting VisTrails ==
 
===  Setup was unable to create  the directory "N:\.vistrails" ===
 
When VisTrails is installing, it tries to create the .vistrails folder in the user's %HOMEPATH% directory.  In some Windows installations, network accounts are set to a directory that the user does not have write access to. Consequently, the installation will fail. To get around this problem, you can use the "-S <directory>" flag when starting VisTrails.  This option allows you to put the .vistrails directory wherever you wish.  You could also write a short script that automatically invokes VisTrails with the "-S" flag pointing to a directory that makes sense for your network. If you are unable to install VisTrails, you can run the installer after setting a new home path from the command line like this:
 
set HOMEPATH=\My\New\Home\
set HOMEDRIVE=C:
vistrails-setup-2.0.1-xxx.exe
 
== Using VisTrails as a Python module ==
 
=== Can I use VisTrails as a Python module without installing PyQt? ===
 
Yes! We have improved the ability to use VisTrails from other software, and have eliminated most GUI (PyQt) dependencies in the core part of the code.  Thus, you can now work with workflow versions and provenance information in a standard Python shell.  Note that packages that directly rely on the GUI, like the VisTrails Spreadsheet, will still require PyQt to be installed.
 
=== How do I open and execute workflows in a standard python shell? ===
 
Here is a simple example that shows how you can open and execute a workflow from a Python script:
 
>>> import vistrails as vt
>>>
>>> vistrail = vt.load_vistrail('simplemath.vt')
>>> vistrail.select_latest_version()
>>> result = vistrail.execute(in_a=2, in_b=4)
>>> result.output_port('out_plus')
6.0
 
A more complete example is available in the VisTrails distribution as [https://github.com/VisTrails/VisTrails/blob/v2.2/examples/api/ipython-notebook.ipynb examples/api/ipython-notebook.ipynb]
 
== Control Flow ==
 
=== Note: using map  ===
 
When using 'map', the module (or subworkflow) used as function port in the map module MUST be a function, i.e., it can only define 1 output port.  [http://www.vistrails.org/usersguide/v2.1/html/controlflow.html#the-map-operator The Map Operator]


== Spreadsheet ==
== Spreadsheet ==
Line 64: Line 213:
=== How can I save an image from the spreadsheet? ===
=== How can I save an image from the spreadsheet? ===


While having the focus on a spreadsheet cell and select the camera on the toolbar to take a snapshot. The system will prompt you for the location and file name where it should be saved. The other icons can be used for saving multiple images that can be used for generating an animation on demand. A whole sheet can also be saved by selecting Export (either from the menu or from the toolbar).
Give focus to a spreadsheet cell and select the camera on the toolbar to take a snapshot. The system will prompt you for the location and file name where it should be saved. The other icons can be used for saving multiple images that can be used for generating an animation on demand. A whole sheet can also be saved by selecting Export (either from the menu or from the toolbar). [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#saving-a-spreadsheet-image Saving a Spreadsheet Image]


===Is it possible to save the complete state of the spreadsheet?===
===Is it possible to save the complete state of the spreadsheet?===
[http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#saving-a-spreadsheet Saving a Spreadsheet]


=== Can I view multiple sheets at the same time? ===
=== Can I view multiple sheets at the same time? ===


Yes. Each sheet on the spreadsheet can be displayed as a dock widget separated from the main spreadsheet window by dragging its tab name out of the tab bar at the bottom of the spreadsheet.
Yes. Each sheet on the spreadsheet can be displayed as a dock widget separated from the main spreadsheet window by dragging its tab name out of the tab bar at the bottom of the spreadsheet. [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#multiple-spreadsheets Multiple Spreadsheets]


=== Then, how can I put back a separated sheet? ===
=== Then, how can I put back a separated sheet? ===


A sheet can be docked back to the main window by dragging it back to the tab bar or double-click on its title bar.
A sheet can be docked back to the main window by dragging it back to the tab bar or double-click on its title bar. [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#multiple-spreadsheets Multiple Spreadsheets]


=== How can I order sheets on the spreadsheet? ===
=== How can I order sheets on the spreadsheet? ===


This can be done by dragging the sheet name on the bottom top bar and drop it to the right place.
This can be done by dragging the sheet name on the bottom top bar and drop it to the right place. [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#multiple-spreadsheets Multiple Spreadsheets]


=== Can I control where a cell will be placed on the spreadsheet window? ===
=== Can I control where a cell will be placed on the spreadsheet window? ===


By default, an unoccupied cell on the active sheet will be chosen to display the result. However, you can specify exactly in the pipeline where a spreadsheet cell will be placed by using CellLocation and SheetReference. CellLocation specifies the location (row and column) of a cell when connecting to a spreadsheet cell (VTKCell, ImageViewerCell, ...). Similarly, a SheetReference module (when connecting to a CellLocation) will specify which sheet the cell will be put on given its name, minimum row size and minimum column size. There is an example of this in examples/vtk.xml (select the version below Double Renderer).
By default, an unoccupied cell on the active sheet will be chosen to display the result. However, you can specify exactly in the pipeline where a spreadsheet cell will be placed by using CellLocation and SheetReference. CellLocation specifies the location (row and column) of a cell when connecting to a spreadsheet cell (VTKCell, ImageViewerCell, ...). Similarly, a SheetReference module (when connecting to a CellLocation) will specify which sheet the cell will be put on given its name, minimum row size and minimum column size. There is an example of this in examples/vtk.xml (select the version below Double Renderer). [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#sending-output-to-the-spreadsheet Sending Output to the Spreadsheet]


=== How do I output results to the spreadsheet? ===
=== How do I output results to the spreadsheet? ===
Line 89: Line 239:
can see there are built-in cells for different kinds of data, e.g., RichTextCell to display HTML and plain text. op
can see there are built-in cells for different kinds of data, e.g., RichTextCell to display HTML and plain text. op
You (the user) can also define new cell types to display application-specific data. For example, we have developed
You (the user) can also define new cell types to display application-specific data. For example, we have developed
VtkCell, MplFigureCell, and  OpenGLCell. It is possible to display pretty much anything on the Spreadsheet!
VtkCell, MplFigureCell, and  OpenGLCell. It is possible to display pretty much anything on the Spreadsheet! [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#sending-output-to-the-spreadsheet Sending Output to the Spreadsheet]


Examples of writing cell modules can be found in:
Examples of writing cell modules can be found in:
Line 101: Line 251:
(2) It must re-implement the updateContents() function to take a set of inputs (usually coming from input ports of a wrapper Module) and display on the cells. VisTrails uses this function to update/reuse cells on the spreadsheet when new data comes in.
(2) It must re-implement the updateContents() function to take a set of inputs (usually coming from input ports of a wrapper Module) and display on the cells. VisTrails uses this function to update/reuse cells on the spreadsheet when new data comes in.


(3) It needs a wrapper VisTrails Module (inherited from basic_widgets.SpreadsheetCell of the spreadsheet package). Inside the compute() method of this module, it may call self.display(CellWidgetType, (inputs)) to trigger the display event on the spreadsheet.
(3) It needs a wrapper VisTrails Module (inherited from basic_widgets.SpreadsheetCell of the spreadsheet package). Inside the compute() method of this module, it may call self.display(CellWidgetType, (inputs)) to trigger the display event on the spreadsheet. [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#advanced-cell-options Advanced Cell Options]


=== How do I control the default number of cells in the spreadsheet? ===
=== How do I control the default number of cells in the spreadsheet? ===


You can configure the rowCount and colCount using the preferences dialog. Just go to the Module Packages tab, select spreadsheet in the "Enabled packages" and press the Configure button. Then a list of all the configuration options for the spreadsheet will show up.
You can configure the rowCount and colCount using the preferences dialog. Just go to the Module Packages tab, select spreadsheet in the "Enabled packages" and press the Configure button. Then a list of all the configuration options for the spreadsheet will show up. [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#custom-layout-options Custom Layout Options]
 
=== Is it possible to launch a web browser from the vistrails spreadsheet?  We would like to output several urls from a parameter sweep and then have the option to click on each one to view the resulting page.  I can view the page within the spreadsheet, but it is really too crowded.===
 
Currently, there isn't a widget that provides exactly this functionality, but I can think of a few solutions that may work for you:
 
(1) You can use parameter exploration to generate multiple sheets so you might have an exploration that opens each page in a new sheet.  Use the third column/dimension in the exploration interface to have a parameter span sheets.
 
(2) The spreadsheet is extensible so you can write a custom spreadsheet cell widget that has a button or label with the desired link (a QLabel with openExternalLinks set to True, for example).
 
(3) You can tweak the existing RichTextCell by adding the line "self.browser.setOpenExternalLinks(True)" at line 63 of the source file "vistrails/packages/spreadsheet/widgets/richtext/richtext.py".  Then, if your workflow creates a file with html markup text like "<a href="http://www.vistrails.org/">VisTrails</a>" connected to a RichTextCell, clicking on the rendered link in the cell will open it in a web browser. You need to add the aforementioned line to the source to let Qt know that you want the link opened externally; by default, it will just issue an event that isn't processed.  [http://www.vistrails.org/usersguide/v2.1/html/spreadsheet.html#launching-a-web-browser Launching a Web Browser]


== Integrating your software into VisTrails ==
== Integrating your software into VisTrails ==
Line 111: Line 271:
===How can I integrate my own program into VisTrails?===
===How can I integrate my own program into VisTrails?===


The easiest way is to create a package. Writing a package is often very simple, here are instructions on how to do it: [[UsersGuideVisTrailsPackages]]
The easiest way is to create a package. Writing a package is often very simple; please refer to [http://www.vistrails.org/usersguide/v2.1/html/packages.html this section of the users' guide].
 
You can also dynamically generate modules. For an example see:
 
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#generating-modules-dynamically Generating Modules Dynamically]
 
In particular, see the new_module call which uses python's type() function to generate new classes dynamically.
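The underlying Python mechanism can be sketched in a few lines. This is not the VisTrails <code>new_module</code> function, whose exact signature is not shown here; <code>Module</code> below is a stand-in base class, and <code>make_module</code> and <code>Doubler</code> are hypothetical names used only to show how <code>type()</code> builds a class at run time.

```python
class Module(object):
    """Stand-in base class for this sketch; not the VisTrails Module."""

    def compute(self):
        raise NotImplementedError


def make_module(name, compute_func, base=Module):
    """Create a new subclass of `base` with the given compute method.

    type(name, bases, namespace) is the three-argument form of type()
    that constructs a class object dynamically.
    """
    return type(name, (base,), {"compute": compute_func})


def _double_compute(self):
    return 2 * self.value


Doubler = make_module("Doubler", _double_compute)

instance = Doubler()
instance.value = 21
result = instance.compute()
```

The generated class behaves like one written with a regular <code>class</code> statement: it has the requested name, inherits from the base class, and can be instantiated normally.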
 
===How do I add a port that is not visible on the module (when it appears on the design canvas)?===
 
This can be accomplished via the "optional" argument.  This is the fourth argument of add_input_port (add_output_port) or can be specified as a kwarg.  In your example, this would look like:
 
reg.add_input_port(MyModule, "MyPort", (core.modules.basic_modules.String, 'MyPort Name'), True)
 
or with kwargs
 
reg.add_input_port(MyModule, "MyPort", (core.modules.basic_modules.String, 'MyPort Name'),\
                    optional=True)
 
or
 
_input_ports = [('MyPort', '(core.modules.basic_modules.String)', {"optional": True})]
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-ports Configuring Ports]


=== How do modules deal with multiple inputs in a same port? ===
=== How do modules deal with multiple inputs in a same port? ===
Line 119: Line 301:
For compatibility reasons, we do need to allow multiple connections to an input port. However, most package developers should never have to use this, and so we do our best to hide it. the default behavior for getting inputs from a port, then, is to always return a single input.
For compatibility reasons, we do need to allow multiple connections to an input port. However, most package developers should never have to use this, and so we do our best to hide it. the default behavior for getting inputs from a port, then, is to always return a single input.


If on your module you need multiple inputs connected to a single port, use the 'forceGetInputListFromPort' method. It will return a list of all the data items coming through the port. The VTK package uses this feature, so look there for usage examples (packages/vtk/base_widget.py)
If on your module you need multiple inputs connected to a single port, use the 'forceGetInputListFromPort' method. It will return a list of all the data items coming through the port. The spreadsheet package uses this feature, so look there for usage examples (vistrails/packages/spreadsheet/basic_widgets.py)
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-ports Configuring Ports]


=== Are there mechanisms for attaching widgets to different modules/parameters? ===
=== Are there mechanisms for attaching widgets to different modules/parameters? ===
Line 130: Line 313:


  registry.add_module(MyModule, namespace='MyNamespace')
  registry.add_module(MyModule, namespace='MyNamespace')
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-modules Configuring Modules - Hierarchy and Visibility]


=== Can I nest namespaces? ===
=== Can I nest namespaces? ===
Line 137: Line 321:
  registry.add_module(MyModule, namespace='ParentNamespace|ChildNamespace')
  registry.add_module(MyModule, namespace='ParentNamespace|ChildNamespace')


=== Are there shortucts for registry initialization? ===
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-modules Configuring Modules - Hierarchy and Visibility]
 
=== Are there shortcuts for registry initialization? ===


Yes.  If you define _modules as a list of classes in the __init__.py file of your package, VisTrails will attempt to load all classes specified as modules.  You can provide add_module options as keyword arguments by specifying a tuple (class, kwargs) in the list.  For example:
Yes.  If you define _modules as a list of classes in the __init__.py file of your package, VisTrails will attempt to load all classes specified as modules.  You can provide add_module options as keyword arguments by specifying a tuple (class, kwargs) in the list.  For example:
Line 151: Line 337:
     _input_ports = [('firstInput', String), ('secondInput', Integer, True)]
     _input_ports = [('firstInput', String), ('secondInput', Integer, True)]
     _output_ports = [('firstOutput', String), ('secondOutput', String)]
     _output_ports = [('firstOutput', String), ('secondOutput', String)]
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#customizing-modules-and-ports Customizing Modules and Ports]


=== Can I define ports to be of types that I do not import into my package? ===
=== Can I define ports to be of types that I do not import into my package? ===
Line 166: Line 354:


   _input_ports = [('myInputPort', '(edu.utah.sci.vistrails.basic:String)')]
   _input_ports = [('myInputPort', '(edu.utah.sci.vistrails.basic:String)')]
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-ports Configuring Ports - Port Types]


=== What do I need to change in my package to make it reloadable (new in v1.4.2)? ===
=== What do I need to change in my package to make it reloadable (new in v1.4.2)? ===
See [http://www.vistrails.org/index.php/UsersGuideVisTrailsPackages#How_to_make_you_package_reloadable UsersGuideVisTrailsPackages] for an explanation.
See [http://www.vistrails.org/usersguide/v2.1/html/packages.html#creating-reloadable-packages Creating Reloadable Packages] for an explanation.


=== Can I add default values or labels for parameters? ===
=== Can I add default values or labels for parameters? ===


Yes.  Versions 1.4 and greater support these features.  See [[UsersGuideVisTrailsPackages#Adding_default_values_and.2For_labels_for_parameters | UsersGuideVisTrailsPacakges]] for more details.
Yes.  Versions 1.4 and greater support these features.  See [http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-ports Configuring Ports - Default Values and Labels] for more details.
 
=== How can I access the default values for a parameter? ===
 
The default values are stored in PortSpec.defaults for each port.
 
=== I want to write a module to load HDF data whose output (e.g., data, string) varies according to the input I give it. Is it possible to do this in VisTrails, and if yes, how can I do that?  Ideally, I would like to avoid having to change the connection of my output every time I change the input. ===
 
There are a few ways to tackle this, and each has its own benefits and pitfalls.  First, module connections respect class hierarchies in the way familiar from object-oriented languages.  For instance, a module can output a Constant, of which String, Float, Integer, etc. are specializations.  In the same way, a module can declare an output of a base type such as HDFData, and the connections will be established regardless of the subtype that is actually passed.  This is a bit dangerous, though: modules downstream may not know how to operate on every type derived from the superclass.  Take extreme care, both when creating the modules and when connecting them, to prevent this from happening.
 
A second method, which I employ in several different packages, is a container class.  For instance, the NumSciPy package uses a relatively generic container, "Numpy Array", to encapsulate the data.  These encapsulating objects can store dictionaries that other modules can easily access and know how to operate on.  Although this method is slightly more work, the stricter typing of ports is beneficial, particularly when interfacing with other packages that may depend on strongly typed constants.
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-ports Varying Output According to the Input]
 
=== I need to determine, at run-time, whether or not a "child" module is attached to the output port of a "parent" module. (I do not specifically need to know which child; just if there is one). ===
 
The outputPorts dictionary of the base Module stores this information.  Thus, you should be able to check
 
("myPortName" in self.outputPorts)
 
on the parent module to check if there are any downstream connections from the port "myPortName".  This might be used, for example, to only set results for output ports that will be used.  ***Note***, however, that the caching algorithm assumes that all outputs are set, so adding a new connection to a previously unconnected output port will not work as desired if that module is cached.  For this reason, I would currently recommend making such a module not cacheable.  Another possibility is overriding the update() method to compare outputPorts against the ports seen at the last computation and set the upToDate flag to False if new ports have been connected.  In a single, limited test, this seemed to work, but be warned that it is not fully tested.  Here is an example:
 
class TestModule(Module):
    _output_ports = [('a1', '(edu.utah.sci.vistrails.basic:String)'),
                      ('a2', '(edu.utah.sci.vistrails.basic:String)')]
    def __init__(self):
        Module.__init__(self)
        self._cached_output_ports = set()
   
    def update(self):
        if len(set(self.outputPorts) - self._cached_output_ports) > 0:
            self.upToDate = False
        Module.update(self)
   
    def compute(self):
        if "a1" in self.outputPorts:
            self.setResult("a1", "test")
        if "a2" in self.outputPorts:
            self.setResult("a2", "test2")
        self._cached_output_ports = set(self.outputPorts)
 
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-ports Configuring Ports - Determining Whether or Not a Module is Attached to an Output Port]
 
=== How can I make a module not display in the modules list? ===
 
You should set the abstract parameter to True when adding the module to the registry.  Using the original syntax, this looks like:
 
def initialize():
    reg = core.modules.module_registry.get_module_registry()
    reg.add_module(InvisibleModule, abstract=True)
    # ...
 
With the _modules dictionary shortcut (for more details, see [http://www.vistrails.org/index.php/FAQ#Are_there_shortcuts_for_registry_initialization.3F the FAQ section on this]), you include it in a kwargs dict as part of a module tuple:
 
_modules = [AnotherModule, (InvisibleModule, {'abstract': True})]
 
There is also a 'hide_descriptor' parameter that prevents the module from appearing in the module palette without declaring it to be abstract.
 
The technical difference between the two is that 'abstract' will not add the item to the module palette while 'hide_descriptor' does add the item but immediately hides it.  If the module should never be instantiated in a workflow, declare it abstract.  If you don't want users to be able to add the module to a pipeline, but you have code that may add it programmatically, declare it with hide_descriptor=True.
 
[http://www.vistrails.org/usersguide/v2.1/html/packages.html#configuring-modules Configuring Modules - Hierarchy and Visibility]
 
=== How do I document individual ports? ===
 
To access port documentation, users can right-click on the port in the port list and choose the corresponding menu item.  To provide this documentation, you should define the <code>provide_input_port_documentation</code> and/or the <code>provide_output_port_documentation</code> '''class''' methods.  Note that these methods take the class and the port name as arguments.  For example,
 
class MyModule(Module):
    _input_ports = [('test', '(edu.utah.sci.vistrails.basic:String)'),
                    ('test2', '(edu.utah.sci.vistrails.basic:String)')]
    port_docs = {'test': 'Some documentation',
                  'test2': 'More documentation'}
    @classmethod
    def provide_input_port_documentation(cls, port_name):
        return cls.port_docs[port_name]
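The output-port counterpart can be sketched the same way. This assumes <code>provide_output_port_documentation</code> takes the same arguments as the input version, as the description above states; <code>Module</code> is a stand-in base class so the snippet runs without VisTrails, and <code>output_port_docs</code> is a hypothetical attribute name.

```python
# Stand-in for core.modules.vistrails_module.Module so the sketch runs alone.
class Module(object):
    pass


class MyModule(Module):
    _output_ports = [('result', '(edu.utah.sci.vistrails.basic:String)')]
    # Hypothetical lookup table; any mapping from port name to text would do.
    output_port_docs = {'result': 'The processed string'}

    @classmethod
    def provide_output_port_documentation(cls, port_name):
        # Returning an empty string for undocumented ports is a choice made
        # in this sketch, not something VisTrails is known to require.
        return cls.output_port_docs.get(port_name, '')


doc = MyModule.provide_output_port_documentation('result')
```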
 
=== How do I access modules from other packages? ===
 
Currently, it is best to access modules from the registry.  First, make sure that any dependencies on another package are specified in the <code>package_dependencies</code> method in <code>__init__.py</code>.  To create a module from another package as an output, you can generate it from the registry.  For example,
 
from core.modules.module_registry import get_module_registry
from core.modules.vistrails_module import Module
class ReturnFigManager(Module):
  _output_ports = [('figManager',
                    '(edu.utah.sci.vistrails.matplotlib:MplFigureManager)')]
  def compute(self):
      reg = get_module_registry()
      wrapper = \
          reg.get_descriptor_by_name("edu.utah.sci.vistrails.matplotlib",
                                    "MplFigureManager").module()
      wrapper.figManager = "blah"
      self.setResult('figManager', wrapper)
 
You can also create subclasses of classes obtained from the registry.  For example,
 
MplFigureManager = get_module_registry().get_descriptor_by_name(
    "edu.utah.sci.vistrails.matplotlib",
    "MplFigureManager").module
class MplFigureManagerSubclass(MplFigureManager):
    pass
 
 
=== How do I create a custom module configuration widget? ===
See [[Module Configuration Example]] for a full example and notes about doing this.
 
=== Can I make a PythonSource module cacheable? ===
 
Yes.  If you have a module that you are planning to re-use in a workflow, we recommend making a packaged module (packaged modules are cacheable by default).  However, you can make a PythonSource module (not cacheable by default) cacheable by including the line
 
self.is_cacheable = lambda *args, **kwargs: True
 
in the source of the PythonSource module.
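The assignment works through ordinary Python attribute lookup: a callable stored on the instance shadows the method defined on the class. The sketch below shows that mechanism in isolation. It assumes VisTrails asks the module instance via <code>is_cacheable()</code>; <code>FakeModule</code> is a hypothetical stand-in, not a VisTrails class.

```python
class FakeModule(object):
    """Stand-in for a module whose class reports that it is not cacheable."""

    def is_cacheable(self):
        return False


module = FakeModule()
before = module.is_cacheable()

# The same assignment as in the PythonSource body, applied to the stand-in.
module.is_cacheable = lambda *args, **kwargs: True
after = module.is_cacheable()

# Other instances of the class keep the class-level behaviour.
other = FakeModule().is_cacheable()
```

Because the override lives on the instance, only that one module is affected.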


== The Console ==
== The Console ==
Line 178: Line 479:
=== Where should I go to find out what I can call from the console and how to import it?  ===
=== Where should I go to find out what I can call from the console and how to import it?  ===


We have tried to make some methods more accessible in the console via an api.  You can import the api via <code>import api</code> in the console and see the available methods with <code>dir(api)</code>.  To open a vistrail:
We have tried to make some methods more accessible in the console via an api.  You can import the api via <code>from vistrails import api</code> in the console and see the available methods with <code>dir(api)</code>.  To open a vistrail:


  import api
  from vistrails import api
  api.open_vistrail_from_file('/Applications/VisTrails/examples/terminator.vt')
  api.open_vistrail_from_file('/Applications/VisTrails/examples/terminator.vt')


Line 199: Line 500:
  b = vtk.vtkContourFilter() # adds a vtkContourFilter module to the pipeline and saves to var b
  b = vtk.vtkContourFilter() # adds a vtkContourFilter module to the pipeline and saves to var b
  b.SetInputConnection0(a.GetOutputPort0()) # connects a's GetOutputPort0 port to b's SetInputConnection0
  b.SetInputConnection0(a.GetOutputPort0()) # connects a's GetOutputPort0 port to b's SetInputConnection0
[http://www.vistrails.org/usersguide/v2.1/html/batch.html#finding-methods-via-the-command-line Finding Methods Via the Command Line]
== Persistence Package ==
=== How do I use the output of one workflow as the input for another using the persistence package? ===
You need to configure the persistence modules using the module's configuration dialog.  After adding a <code>PersistentOutputFile</code> to the workflow, click on the triangle in the upper-right corner of the <code>PersistentOutputFile</code>, and select "Edit Configuration" from the menu that appears.  In this dialog, select "Create New Reference" and give the reference a name (and any space-delimited tags).  Upon running that workflow, the data will be written to the persistent store.  In the second workflow where you wish to use that file, add a <code>PersistentInputFile</code> and go to its configuration dialog in the same manner as with the output file.  In that dialog, select "Use Existing Reference" and select the data that you just added in the first workflow in the list of files below.  Now, when you run that workflow, it will grab the data from the persistent store.
Here is an example: [[Media:offscreen_persistent.vt | offscreen_persistent.vt]].  Run the "persistent offscreen" workflow first, and then run the "display persistent output" to use the output of the first workflow as the input for the second.


== VTK ==
== VTK ==
Line 210: Line 520:
To use VTK on VisTrails, you need a slightly different way of connecting the renderer modules. Instead of using the standard RenderWindow/RenderWindowInteractor infrastructure, you simply connect the renderer to a VTKCell. The examples directory in the distribution has several VTK examples that illustrate.
To use VTK on VisTrails, you need a slightly different way of connecting the renderer modules. Instead of using the standard RenderWindow/RenderWindowInteractor infrastructure, you simply connect the renderer to a VTKCell. The examples directory in the distribution has several VTK examples that illustrate.


=== I am trying to add a module to the workfow via Python, but how can I access vtk modules? ===
=== I am trying to add a module to the workflow via Python, but how can I access vtk modules? ===


Here's an example:
Here's an example:


import api
import api
vtvtk = 'edu.utah.sci.vistrails.vtk'
module = api.add_module(0, 0, vtvtk, 'vtkContourFilter', '')
 
 
The third argument in add_module is the package identifier.  You can find this in the "Module Packages" panel of the Preferences; just click on the package you're interested in and it will appear in the information on the right.
 
== matplotlib ==


vtvtk = 'edu.utah.sci.vistrails.vtk'
=== I'm experiencing a problem with Latex labels and the matplotlib that comes with VisTrails 1.5. The script below entered to the interpreter that comes with VT is sufficient to reproduce it. ===


module = api.add_module(0, 0, vtvtk, 'vtkContourFilter', '')
  import matplotlib.pyplot as plt
  plt.plot([1,2,3],[1,2,3])
  plt.xlabel("$foo$")


Remove your ~/.matplotlib folder and re-start VisTrails


The third argument in add_module is the package identifier.  You can find this in the "Module Packages" panel of the Preferences; just click on the package you're interested in and it will appear in the information on the right.
== rpy ==
 
=== Package rpy fails with "module object has no attribute RVector" ===
 
The rpy package needs to be updated to support a newer rpy version. In "packages/rpy/init.py", replace all instances of "objects.RVector" with "objects.Vector", or use [https://raw.github.com/VisTrails/VisTrails/d0711a8d5ba8992662f69ed7924c625a3658b6a7/vistrails/packages/rpy/init.py this] file.
 
== JobSubmission ==
 
The JobSubmission package depends on the stable version of BatchQ. Download https://github.com/troelsfr/BatchQ/archive/stable.zip, copy the "BatchQ-stable/batchq" directory to your local site-packages folder. Copying it to the "vistrails/packages/JobSubmission" folder should also work. See batchq/contrib/vistrails for examples.
 
== VisTrails Development ==
 
=== I would like to build VisTrails from source. Are there instructions on how to do this? ===
 
Yes! Take a look at [http://www.vistrails.org/usersguide/v2.1/html/getting_started.html#installing-vistrails-from-source Installing VisTrails from source]
 
== Accessing Provenance Information ==
 
=== How do I access the information in the execution log? ===
 
The code responsible for storing execution information is located in the "core/log" directories, and the code that generates much of that information is in "core/interpreter/cached.py".  Modules can add execution-specific annotations to provenance via annotate() calls during execution, but much of the data (like timing and errors) is captured by the LogController and CachedInterpreter (the execution engine) objects.  To analyze the log from a vistrail (.vt) file, you might have something like the following:
 
<code>
  import core.log.log
  import db.services.io
 
  def run(fname):
      # open the .vt bundle specified by the filename "fname"
      bundle = db.services.io.open_vistrail_bundle_from_zip_xml(fname)[0]
      # get the log filename
      log_fname = bundle.vistrail.db_log_filename
      if log_fname is not None:
          # open the log
          log = db.services.io.open_log_from_xml(log_fname, True)
          # convert the log from a db object
          core.log.log.Log.convert(log)
          for workflow_exec in log.workflow_execs:
              print 'workflow version:', workflow_exec.parent_version
              print 'time started:', workflow_exec.ts_start
              print 'time ended:', workflow_exec.ts_end
              print 'modules executed:', [i.module_id
                                          for i in workflow_exec.item_execs]
  if __name__ == '__main__':
      run("some_vistrail.vt")
</code>
 
You should be able to see what information is available by looking at the "core/log" classes.  [http://www.vistrails.org/usersguide/v2.1/html/log.html Accessing the Execution Log]
 
== VisTrails Binaries ==
=== Is there a Mac OS X 10.6+ x64 binary of the version 1.7 of VisTrails available?  ===
 
We don't have a 64bit Mac binary for v1.7 release because at the time we didn't have 64 bit versions of the libraries shipped in the 1.7 binary.
 
However, it is possible to update a 64bit or any other binary with a source release of VisTrails, including the sources of 1.7 version or the nightly builds.
 
Assuming you have the sources of 1.7 in /vistrails1.7 and the 64bit binary in /Applications/VisTrails1.7 do the following steps:
  cp /vistrails1.7/vistrails/vistrails.py  /Applications/VisTrails1.7/VisTrails.app/Contents/Resources
  cp -r /vistrails1.7/vistrails/api /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/core /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/db /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/gui /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/packages /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/examples /Applications/VisTrails1.7/
  cp -r /vistrails1.7/extensions /Applications/VisTrails1.7/
  cp -r /vistrails1.7/scripts /Applications/VisTrails1.7/
 
=== AVG Antivirus falsely reports a virus in the VisTrails 2.0.2 32-bit Windows installer ===
 
The problematic file is vistrails/Python27/Lib/site-packages/_mysql.pyd:
https://www.virustotal.com/en/file/d8aabd921b5eba8aabcce936ce3b92e3d1de43eb44c43d921ca1b9ab91d7fd81/analysis/1366640335/. This is most likely a false positive and can be ignored.
 
=== VisTrails fails after upgrading to OSX 10.9 ===
 
Reinstalling [http://xquartz.macosforge.org/ XQuartz] should solve the problem.

Latest revision as of 13:59, 26 May 2017

Also check our Known Issues page for troubleshooting.


Running workflows

How can I run a workflow using the command line?

(Updated for version 1.2) Call vistrails using the following options:

python vistrails.py -b path_to_vistrails_file:pipeline

where pipeline can be a version tag name or version id

NOTE: If you downloaded the MacOS X bundle, you can run vistrails from the command line via the following commands in the Terminal. Change the current directory to wherever VisTrails was installed (often /Applications), and then type:

Vistrails.app/Contents/MacOS/vistrails [<cmd_line_options>] Running a Specific Workflow in Batch Mode

Using the command line, we'd like to execute a workflow multiple times, with slightly different parameters, and create a series of output files. Is this possible?

(Updated for version 1.2) We can change parameters that have an alias through the command line.

For example, the offscreen pipeline in offscreen.vt always creates a file called image.png. If you want to generate it with a different filename:

python vistrails.py -b ../examples/offscreen.vt:offscreen -a"filename=other.png"

filename in the example above is the alias name assigned to the parameter in the value method inside the String module. When running a pipeline from the command line, VisTrails will try to start the spreadsheet automatically if the pipeline requires it. For example, this other execution will also start the spreadsheet (note how the $ characters are escaped when running in bash):

python vistrails.py -b ../examples/head.vt:aliases -a"isovalue=30\$&\$diffuse_color=0.8,0.4,0.2"

You can also execute more than one pipeline on the command line:

python vistrails.py -b ../examples/head.vt:aliases ../examples/spx.vt:spx -a"isovalue=30"

Use the -a parameter only once regardless of the number of pipelines. Running a Workflow with Specific Parameters
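To produce a series of output files, the batch command can be generated once per parameter set. The following is a minimal sketch that only builds the argument lists and runs nothing. It assumes the <code>-b</code> and <code>-a</code> options and the <code>$&$</code> alias separator behave as in the examples above; <code>build_batch_command</code> and the <code>image_NN.png</code> file names are hypothetical.

```python
def build_batch_command(vistrail, version, aliases):
    """Return an argv list for one batch run; nothing is executed here."""
    # Several aliases are joined with the "$&$" separator used above.
    alias_arg = "$&$".join("%s=%s" % (name, value) for name, value in aliases)
    return ["python", "vistrails.py", "-b",
            "%s:%s" % (vistrail, version), "-a" + alias_arg]


# One command per output file, varying only the "filename" alias.
commands = [
    build_batch_command("../examples/offscreen.vt", "offscreen",
                        [("filename", "image_%02d.png" % index)])
    for index in range(3)
]
```

Each list can be handed to <code>subprocess.call</code>. Since no shell is involved in that case, the $ characters need no escaping.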

I can load a vistrail, and the version tree shows up fine. However, no pipelines appear when I click on a version. What gives?

The most likely reason is that the vistrail uses a package that is not registered with VisTrails. You need to identify the needed package and add it to your .vistrails/startup.py. A single line like the following should be enough:

addPackage('enter_package_name_here')

Some packages might need more information. For example:

addPackage('afront', executable_path='/path/to/afront')

Refer to the package documentation for details. The one inconvenience is that there is currently no automated way to identify the missing package. We're working on this feature for future releases.

I have a workflow that reads a file and then does some processing. The first time it runs, it executes correctly. But in subsequent runs, nothing happens.

VisTrails caches by default, so after a workflow is executed, if none of its parameters change, it won't be executed again.

If a workflow reads a file using the basic module File, VisTrails does check whether the file was modified since the last run. It does so by keeping a signature that is based on the modification time of the file. And if the file was modified, the File module and all downstream modules (the ones which depend on File) will be executed.
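The idea of a signature based on modification time can be sketched in a few lines. This is an illustration only: the way VisTrails actually computes its signature is not shown here, and <code>file_signature</code> and the choice of SHA-1 are assumptions.

```python
import hashlib
import os
import tempfile


def file_signature(path):
    """Hash the absolute path together with the file's modification time."""
    mtime = os.path.getmtime(path)
    payload = "%s:%r" % (os.path.abspath(path), mtime)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# Demonstration on a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"first version")
    demo_path = handle.name

sig_before = file_signature(demo_path)
sig_same = file_signature(demo_path)

# Simulate a later modification by moving the timestamp forward.
stat = os.stat(demo_path)
os.utime(demo_path, (stat.st_atime, stat.st_mtime + 10))
sig_after = file_signature(demo_path)

os.remove(demo_path)
```

An unchanged file yields the same signature, so cached results can be reused; a newer modification time yields a different one, which would force the file-reading step and everything downstream to run again.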


Note: If you would like your input and output data to be versioned, you can use the Persistence package.

If you do not want VisTrails to cache executions, you can turn off caching: go to Menu Edit -> Preferences and in the General Configuration tab, change Cache execution results to Never. Workflow Execution

Can VisTrails execute workflows in parallel?

The VisTrails server can only execute pipelines in parallel if there's more than one instance of VisTrails running. The command

self.rpcserver = ThreadedXMLRPCServer((self.temp_xml_rpc_options.server, self.temp_xml_rpc_options.port))

starts a multithreaded version of the XML-RPC server, so it will create a thread for each request received by the server. The problem is that Qt/PyQt allows GUI objects to be created only in the main thread, not in these additional threads. To overcome this limitation, the multithreaded version can instantiate other single-threaded instances of VisTrails and put them in a queue, so that workflow executions and other GUI-related requests, such as generating workflow graphs and history trees, can be forwarded to this queue, where each instance takes its turn answering a request. If the results are in the cache, the multithreaded version answers the requests directly.

Note that this infrastructure works on Linux only. To make this work on Windows, you have to create a script similar to start_vistrails_xvfb.sh (located in the scripts folder) where you can send the number of other instances via command-line options to VisTrails. The command line options are:

python vistrails_server.py -T <ADDRESS> -R <PORT> -O<NUMBER_OF_OTHER_VISTRAILS_INSTANCES> [-M]&

If you want the main vistrails instance to be multithreaded, use the -M at the end.

After creating this script, update function start_other_instances in vistrails/gui/application_server.py lines 1007-1023 and set the script variable to point to your script. You may also have to change the arguments sent to your script (line 1016: for example, you don't need to set a virtual display). You will need to change the path to the stop_vistrails_server.py script (on line 1026) according to your installation path. Executing Workflows in Parallel

When a workflow is executed, what do the colors mean?

- lilac: module was not executed

- yellow: module is currently being executed

- green: module was successfully executed

- orange: module was cached

- red: the execution of the module failed

Workflow Execution

Workflow execution hangs on Windows

This can happen if you are using "quick edit mode" in the console and have print statements in your code. Standard output can then get blocked by the console. Pressing space in the console resumes the execution. To avoid this problem, either disable "quick edit mode", or avoid print statements in your code.

VisTrails does not install missing system packages

If VisTrails does not try to install missing system packages, it may be because it cannot determine your system type. In that case, you can run this (in Python) to determine your system type:

   import platform
   platform.linux_distribution()

And add this system name to gui/bundles/utils.py by, e.g., modifying the _guess_ubuntu method (if your system is apt-based):

   def _guess_ubuntu():
       return platform.linux_distribution()[0]=='Ubuntu' or \
              platform.linux_distribution()[0]=='YourSystemName'

Cannot update subworkflows after upgrading packages or vistrails version

When a package used by a subworkflow is upgraded, any subworkflows that use it will be automatically upgraded. An upgraded subworkflow may then lose the ability to be updated to a newer local subworkflow. In this case, the subworkflow needs to be updated by hand by removing it from the pipeline and dragging it in again from the module palette. This may get fixed in a future release.

Building workflows

Is there a way to give each widget a "display name" in addition to the module name at the center of the widget?

Yes, a "display name" can be assigned to a module by selecting the triangle in its top right corner to open a popup menu and selecting the Set Module Label... menu item. You will then be prompted to enter the "display name". Changing Module Labels

Is there a way to re-center the picture-in-picture (PiP) view?

Yes. If you click on the PIP window to bring it to focus, you can press Ctrl-R (or Command-R on Mac) to re-center the PiP window. Vistrails Interaction

How do I search for a literal "?" (question mark) in the search box in the Property panel?

Since we allow regular expressions in our search box, question marks are treated as meta-characters. Thus, searching for "?" returns everything and "abc?" will return everything containing "abc". You need to use "\?" instead to search for "?". So the search for "??" would be "\?\?". Textual Queries
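The same escaping rule can be tried out with Python's <code>re</code> module. This only illustrates how a backslash turns "?" into a literal character; it is not the search box implementation, and the <code>names</code> list is made up for the demonstration.

```python
import re

names = ["abc", "abc?", "what??"]

# re.escape puts a backslash in front of each "?" so it is matched literally.
single = re.escape("?")
double = re.escape("??")

hits_single = [name for name in names if re.search(single, name)]
hits_double = [name for name in names if re.search(double, name)]
```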

Saving a vistrail fails when Running VisTrails on Windows inside a Virtual Machine

After installing Windows in a Virtual Machine, the path to zip.exe may be missing, and you may see this error when trying to save a vistrail:

   WindowsError: [Error 2] The system cannot find the file specified: '***/vt.zip'

Then you need to add the path to zip.exe, which is included in the binary distribution of VisTrails, to your PATH variable.

Using VisTrails as a server

What is the VisTrails server-mode?

Using the VisTrails server mode, it is possible to execute workflows and control VisTrails through another application. For example, the CrowdLabs Web portal (http://www.crowdlabs.org) accesses a VisTrails server to execute workflows and to retrieve and display vistrail trees and workflows. Using VisTrails as a Server

How do I execute workflows and control VisTrails through another application?

You access the server by making XML-RPC calls. In the current VisTrails release, we include a set of PHP scripts that can talk to a VisTrails server instance. They are in the "extensions/http" folder. The files are reasonably well documented. Also, it should not be difficult to create Python scripts to access the server (just use the xmlrpclib module).

Note that the VisTrails server requires the provenance and workflows to be in a database. More detailed instructions on how to setup the server and the database are available here:

http://www.crowdlabs.org/site_media/static/dev_docs/vistrails_server_setup.html

http://www.crowdlabs.org/site_media/static/dev_docs/vistrails_database_setup.html

If what you want is just to execute a series of workflows in batch mode, a simpler solution would be to use the VisTrails client in batch mode. Chapter 12 of the user's guide contains detailed information and examples on that. Running VisTrails in Batch Mode

VisTrails server executes a workflow but generates a blank image and generates the error message cannot get access to X server

You will need to check if the display the server is trying to use is a valid display (by default it uses the display 0). On linux, the command w will list the logged users and the display associated with them (FROM column).

Note that the VisTrails server requires the machine to be running X.

cannot get access to X server

Running VisTrails in server or batch mode requires a connection to an X server.

No additional setup is required if you run VisTrails on a terminal because you are already logged in to X. To make it work in other scenarios, you need to run the python command through Xvfb or make sure you can run cgi scripts that access the GUI.

If you can run Xvfb, you can use the following script, where you need to configure the first four variables according to your system: http://www.vistrails.org/images/Run_vistrails_batch_xvfb.script.sh.txt

(To run the script, rename the file and remove the ".txt")


You should also modify your cgi script to invoke the bash script instead of vistrails directly. The bash script will accept the virtual display, the vistrail file and workflow tag as input arguments.

Another possibility is if your workflow does not require the GUI, you can use VisTrails as a regular python module and it will not require the GUI or X Server to run. This functionality is available in the nightly builds and will be included in VisTrails 2.0 beta to be released soon. There is an example of how to use this feature in our FAQ: http://www.vistrails.org/index.php/FAQ#Using_VisTrails_as_a_Python_module

Problems starting VisTrails

Setup was unable to create the directory "N:\.vistrails"

When VisTrails is installing, it tries to create the .vistrails folder in the users %HOMEPATH% directory. In some Windows installations, network accounts are set to a directory that a user does not have write access to. Consequently, the installation will fail. To get around this problem, you can use the "-S <directory>" flag when starting VisTrails. This option allows you to put the .vistrails directory wherever you wish. You could also write a short script that automatically invokes VisTrails with the "-S" flag pointing to a directory that makes sense to your network. If you are unable to install VisTrails, you can run the installer after setting a new home path from the command line like this:

set HOMEPATH=\My\New\Home\
set HOMEDRIVE=C:
vistrails-setup-2.0.1-xxx.exe

Using VisTrails as a Python module

Can I use VisTrails as a Python module without installing PyQt?

Yes! We have improved the ability to use VisTrails from other software, and have eliminated most GUI (PyQt) dependencies in the core part of the code. Thus, you can now work with workflow versions and provenance information in a standard Python shell. Note that packages that directly rely on the GUI, like the VisTrails Spreadsheet, will still require PyQt to be installed.

How do I open and execute workflows in a standard python shell?

Here is a simple example that shows how you can open and execute a workflow from a Python script:

>>> import vistrails as vt
>>>
>>> vistrail = vt.load_vistrail('simplemath.vt')
>>> vistrail.select_latest_version()
>>> result = vistrail.execute(in_a=2, in_b=4)
>>> result.output_port('out_plus')
6.0

A more complete example is available in the VisTrails distribution as examples/api/ipython-notebook.ipynb

Control Flow

Note: using map

When using 'map', the module (or subworkflow) used as function port in the map module MUST be a function, i.e., it can only define 1 output port. The Map Operator

Spreadsheet

Where pipeline is a version number or a tag.

How can I save an image from the spreadsheet?

With the focus on a spreadsheet cell, select the camera on the toolbar to take a snapshot. The system will prompt you for the location and file name where it should be saved. The other icons can be used for saving multiple images that can be used for generating an animation on demand. A whole sheet can also be saved by selecting Export (either from the menu or from the toolbar). Saving a Spreadsheet Image

Is it possible to save the complete state of the spreadsheet?

Saving a Spreadsheet

Can I view multiple sheets at the same time?

Yes. Each sheet on the spreadsheet can be displayed as a dock widget separated from the main spreadsheet window by dragging its tab name out of the tab bar at the bottom of the spreadsheet. Multiple Spreadsheets

Then, how can I put back a separated sheet?

A sheet can be docked back to the main window by dragging it back to the tab bar or double-clicking on its title bar. Multiple Spreadsheets

How can I order sheets on the spreadsheet?

This can be done by dragging the sheet name along the tab bar at the bottom and dropping it in the right place. Multiple Spreadsheets

Can I control where a cell will be placed on the spreadsheet window?

By default, an unoccupied cell on the active sheet will be chosen to display the result. However, you can specify exactly in the pipeline where a spreadsheet cell will be placed by using CellLocation and SheetReference. CellLocation specifies the location (row and column) of a cell when connecting to a spreadsheet cell (VTKCell, ImageViewerCell, ...). Similarly, a SheetReference module (when connecting to a CellLocation) will specify which sheet the cell will be put on given its name, minimum row size and minimum column size. There is an example of this in examples/vtk.xml (select the version below Double Renderer). Sending Output to the Spreadsheet

How do I output results to the spreadsheet?

By inspecting the VisTrails Spreadsheet package (in the list of packages, to the left of the pipeline builder), you can see there are built-in cells for different kinds of data, e.g., RichTextCell to display HTML and plain text. You (the user) can also define new cell types to display application-specific data. For example, we have developed VtkCell, MplFigureCell, and OpenGLCell. It is possible to display pretty much anything on the Spreadsheet! Sending Output to the Spreadsheet

Examples of writing cell modules can be found in:

RichTextCell: packages/spreadsheet/widgets/richtext/richtext.py

VTK: packages/vtk/vtkcell.py

Here is the summary of some requirements on a cell widget:

(1) It must be a Qt widget. It should inherit from spreadsheet_cell.QCellWidget in the spreadsheet package. Although any Qt Widget would work, certain features such as animation will not be available (without rewriting it).

(2) It must re-implement the updateContents() function to take a set of inputs (usually coming from input ports of a wrapper Module) and display them in the cell. VisTrails uses this function to update/reuse cells on the spreadsheet when new data comes in.

(3) It needs a wrapper VisTrails Module (inherited from basic_widgets.SpreadsheetCell of the spreadsheet package). Inside the compute() method of this module, it may call self.display(CellWidgetType, (inputs)) to trigger the display event on the spreadsheet. Advanced Cell Options

How do I control the default number of cells in the spreadsheet?

You can configure the rowCount and colCount using the preferences dialog. Just go to the Module Packages tab, select spreadsheet in the "Enabled packages" and press the Configure button. Then a list of all the configuration options for the spreadsheet will show up. Custom Layout Options

Is it possible to launch a web browser from the vistrails spreadsheet? We would like to output several urls from a parameter sweep and then have the option to click on each one to view the resulting page. I can view the page within the spreadsheet, but it is really too crowded.

Currently, there isn't a widget that provides exactly this functionality, but I can think of a few solutions that may work for you:

(1) You can use parameter exploration to generate multiple sheets so you might have an exploration that opens each page in a new sheet. Use the third column/dimension in the exploration interface to have a parameter span sheets.

(2) The spreadsheet is extensible so you can write a custom spreadsheet cell widget that has a button or label with the desired link (a QLabel with openExternalLinks set to True, for example).

(3) You can tweak the existing RichTextCell by adding the line "self.browser.setOpenExternalLinks(True)" at line 63 of the source file "vistrails/packages/spreadsheet/widgets/richtext/richtext.py". Then, if your workflow creates a file with html markup text like "<a href="http://www.vistrails.org/">VisTrails</a>" connected to a RichTextCell, clicking on the rendered link in the cell will open it in a web browser. You need to add the aforementioned line to the source to let Qt know that you want the link opened externally; by default, it will just issue an event that isn't processed. Launching a Web Browser
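
For option (3), the html markup can be produced with a few lines of Python. The following is only a sketch: links_to_html is a hypothetical helper name, not part of VisTrails, and it uses only the standard library. It builds one link per URL and escapes special characters so the markup stays valid.

```python
from xml.sax.saxutils import escape, quoteattr


def links_to_html(urls):
    """Return an HTML fragment with one clickable link per URL."""
    items = []
    for url in urls:
        # quoteattr() escapes the URL and wraps it in quotes for the href attribute
        items.append('<li><a href=%s>%s</a></li>' % (quoteattr(url), escape(url)))
    return '<ul>\n%s\n</ul>' % '\n'.join(items)


html = links_to_html(['http://www.vistrails.org/'])
```

The resulting string can be written to a file by the workflow and that file connected to a RichTextCell as described above.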

Integrating your software into VisTrails

How can I integrate my own program into VisTrails?

The easiest way is to create a package. Writing a package is often very simple; please refer to this section of the users' guide.

You can also dynamically generate modules. For an example see:

Generating Modules Dynamically

In particular, see the new_module call which uses python's type() function to generate new classes dynamically.
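
As a rough illustration of the underlying Python mechanism (not of new_module itself, whose signature is not reproduced here), the sketch below uses type() to build classes at run time. The Module class is a stand-in for the VisTrails base class and make_module is a hypothetical helper; both are assumptions made so the example runs on its own.

```python
class Module(object):
    """Stand-in for the VisTrails Module base class (assumption for this sketch)."""

    def compute(self):
        raise NotImplementedError


def make_module(name, message):
    """Build a new Module subclass at run time with type()."""
    def compute(self):
        # each generated class gets its own compute() closure
        return message

    # type(name, bases, namespace) creates a new class object
    return type(name, (Module,), {'compute': compute,
                                  '__doc__': 'Generated module %s' % name})


generated = dict((n, make_module(n, 'hello from ' + n)) for n in ['Alpha', 'Beta'])
```

In a real package, the generated classes would then be registered like any hand-written module.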

How do I add a port that is not visible on the module (when it appears on the design canvas)?

This can be accomplished via the "optional" argument. This is the fourth argument of add_input_port (add_output_port) or can be specified as a kwarg. In your example, this would look like:

reg.add_input_port(MyModule, "MyPort", (core.modules.basic_modules.String, 'MyPort Name'), True)

or with kwargs

reg.add_input_port(MyModule, "MyPort", (core.modules.basic_modules.String, 'MyPort Name'),\
                   optional=True)

or

_input_ports = [('MyPort', '(core.modules.basic_modules.String)', {"optional": True})]

Configuring Ports

How do modules deal with multiple inputs on the same port?

(And should that even be allowed?)

For compatibility reasons, we do need to allow multiple connections to an input port. However, most package developers should never have to use this, and so we do our best to hide it. The default behavior for getting inputs from a port, then, is to always return a single input.

If your module needs multiple inputs connected to a single port, use the 'forceGetInputListFromPort' method. It will return a list of all the data items coming through the port. The spreadsheet package uses this feature, so look there for usage examples (vistrails/packages/spreadsheet/basic_widgets.py). Configuring Ports

Are there mechanisms for attaching widgets to different modules/parameters?

Right now, we have a mechanism for putting a specific widget for an input port. For example, if a port is SetColor(red, green, blue), we can put a color wheel widget there. Or we can also replace the SetFileName port with a File Widget. However, this is not per parameter (only per port). We are currently working on this problem.

Can I organize my package so it appears hierarchical in the module palette?

Yes. Use the namespace keyword argument when adding the module to the registry. For example,

registry.add_module(MyModule, namespace='MyNamespace')

Configuring Modules - Hierarchy and Visibility

Can I nest namespaces?

Yes. Use the '|' character to separate the levels of the hierarchy. For example,

registry.add_module(MyModule, namespace='ParentNamespace|ChildNamespace')

Configuring Modules - Hierarchy and Visibility
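
To picture how '|'-separated namespace strings map onto a hierarchy, here is a small stand-alone sketch. It is not how VisTrails builds the module palette; namespace_tree is a hypothetical helper that simply nests the levels into dictionaries.

```python
def namespace_tree(namespaces):
    """Nest '|'-separated namespace strings into a dict of dicts."""
    tree = {}
    for namespace in namespaces:
        node = tree
        for level in namespace.split('|'):
            # create the level on first use, then descend into it
            node = node.setdefault(level, {})
    return tree


tree = namespace_tree(['ParentNamespace|ChildNamespace',
                       'ParentNamespace|Other',
                       'MyNamespace'])
```

Here 'ParentNamespace' ends up with two children, while 'MyNamespace' is a top-level entry of its own.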

Are there shortcuts for registry initialization?

Yes. If you define _modules as a list of classes in the __init__.py file of your package, VisTrails will attempt to load all classes specified as modules. You can provide add_module options as keyword arguments by specifying a tuple (class, kwargs) in the list. For example:

_modules = [MyModule1, (MyModule2, {'namespace': 'MyNamespace'})]

In addition, you need to identify the ports of your modules as a field in your class by defining _input_ports and _output_ports lists. Here, the items in each list must be tuples of the form (portName, portSignature, optional=False, sort_key=-1). For example:

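from core.modules.vistrails_module import Module
from core.modules.basic_modules import String, Integer

class MyModule(Module):
    def compute(self):
        pass

    _input_ports = [('firstInput', String), ('secondInput', Integer, True)]
    _output_ports = [('firstOutput', String), ('secondOutput', String)]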

Customizing Modules and Ports
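
The two entry forms allowed in _modules (a bare class, or a (class, kwargs) tuple) can be handled uniformly. The sketch below shows one way to do that; normalize_modules is a hypothetical helper, the classes are empty stand-ins, and none of this is the actual VisTrails loading code.

```python
def normalize_modules(modules):
    """Yield (cls, kwargs) pairs from a _modules-style list."""
    for entry in modules:
        if isinstance(entry, tuple):
            # (class, kwargs) form: kwargs are the add_module options
            cls, kwargs = entry
        else:
            # bare class form: no extra options
            cls, kwargs = entry, {}
        yield cls, dict(kwargs)


class MyModule1(object):
    """Stand-in for a real Module subclass."""


class MyModule2(object):
    """Stand-in for a real Module subclass."""


_modules = [MyModule1, (MyModule2, {'namespace': 'MyNamespace'})]
pairs = list(normalize_modules(_modules))
```

Each resulting pair corresponds to one add_module call with the given keyword arguments.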

Can I define ports to be of types that I do not import into my package?

Yes. You can pass an identifier string as the portSignature instead. The port_signature string is defined by:

<module_string> := <package_identifier>:[<namespace>|]<module_name>,
<port_signature> := (<module_string>*)

For example,

registry.add_input_port(MyModule, 'myInputPort', '(edu.utah.sci.vistrails.basic:String)')

or

 _input_ports = [('myInputPort', '(edu.utah.sci.vistrails.basic:String)')]

Configuring Ports - Port Types
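
To make the grammar above concrete, the following sketch splits a port signature string into its parts. parse_port_signature is a hypothetical helper, not a VisTrails function, and it assumes that multiple module strings inside the parentheses are separated by commas.

```python
def parse_port_signature(signature):
    """Split '(pkg:ns|Name,pkg:Name)' into (package, namespace, name) tuples."""
    signature = signature.strip()
    if not (signature.startswith('(') and signature.endswith(')')):
        raise ValueError('port signature must be wrapped in parentheses')
    modules = []
    # assumption: module strings are comma-separated
    for module_string in signature[1:-1].split(','):
        module_string = module_string.strip()
        if not module_string:
            continue
        package, sep, rest = module_string.partition(':')
        if not sep or not package or not rest:
            raise ValueError('expected <package_identifier>:<module_name>, got %r'
                             % module_string)
        # the namespace is optional; the module name follows the last '|'
        namespace, _, name = rest.rpartition('|')
        modules.append((package, namespace or None, name))
    return modules


parsed = parse_port_signature('(edu.utah.sci.vistrails.basic:String)')
```

A nested namespace such as 'Parent|Child' is kept as a single string in the second tuple element.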

What do I need to change in my package to make it reloadable (new in v1.4.2)?

See Creating Reloadable Packages for an explanation.

Can I add default values or labels for parameters?

Yes. Versions 1.4 and greater support these features. See Configuring Ports - Default Values and Labels for more details.

How can I access the default values for a parameter?

The default values are stored in PortSpec.defaults for each port.

I want to write a module to load HDF data whose output (e.g., data, string) varies according to the input I give it. Is it possible to do this in VisTrails, and if yes, how can I do that? Ideally, I would like to avoid having to change the connection of my output every time I change the input.

There are a few ways to tackle this, and each has its own benefits and pitfalls. Firstly, module connections do respect class hierarchies as we're familiar with in object-oriented languages. For instance, a module can output a Constant, of which String, Float, Integer, etc. are specializations. In this way, you can have a subclass of something like HDFData be passed out of the module and the connections will be established regardless of the sub-type. This is a bit dangerous though. Modules downstream of such a class may not really know how to operate on certain types derived from the super-class. Extreme care must be taken both when creating the modules and when connecting them to prevent things like this from happening.

A second method that I employ in several different packages is the idea of a container class. For instance, the NumSciPy package uses a relatively generic container "Numpy Array" to encapsulate the data. Of course, these encapsulating objects can store dictionaries that other modules can easily access and understand how to operate on. Although this method is slightly more work, the stricter typing of ports is beneficial, particularly when interfacing with other packages that may depend on strongly typed constants (for example). Varying Output According to the Input
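
A minimal sketch of the container idea, in plain Python: DataContainer and describe are hypothetical names, not classes from NumSciPy or VisTrails. The port type stays the same (the container), while the payload and a metadata dictionary vary with the input.

```python
class DataContainer(object):
    """Hypothetical container; one port type regardless of the payload inside."""

    def __init__(self, data, **attributes):
        self.data = data
        # free-form metadata that downstream modules can inspect
        self.attributes = dict(attributes)

    def kind(self):
        # fall back to the Python type name when no 'kind' was recorded
        return self.attributes.get('kind', type(self.data).__name__)


def describe(container):
    """Downstream code checks the metadata instead of relying on a subclass."""
    if container.kind() == 'string':
        return 'text of length %d' % len(container.data)
    if container.kind() == 'array':
        return 'array with %d values' % len(container.data)
    raise TypeError('unsupported payload: %s' % container.kind())


summary = describe(DataContainer('abc', kind='string'))
```

A downstream module that meets a payload it does not understand fails with a clear error instead of silently misbehaving.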

I need to determine, at run-time, whether or not a "child" module is attached to the output port of a "parent" module. (I do not specifically need to know which child; just if there is one).

The outputPorts dictionary of the base Module stores this information. Thus, you should be able to check

("myPortName" in self.outputPorts)

on the parent module to check if there are any downstream connections from the port "myPortName". This might be used, for example, to only set results for output ports that will be used. Note, however, that the caching algorithm assumes that all outputs are set, so adding a new connection to a previously unconnected output port will not work as desired if that module is cached. For this reason, I would currently recommend making such a module not cacheable. Another possibility is overriding the update() method to compare outputPorts against the ports seen at the last execution and to clear the upToDate flag if new ports have been connected. In a single, limited test, this seemed to work, but be warned that it is not fully tested. Here is an example:

class TestModule(Module):
    _output_ports = [('a1', '(edu.utah.sci.vistrails.basic:String)'),
                     ('a2', '(edu.utah.sci.vistrails.basic:String)')]
    def __init__(self):
        Module.__init__(self)
        self._cached_output_ports = set()
    
    def update(self):
        if len(set(self.outputPorts) - self._cached_output_ports) > 0:
            self.upToDate = False
        Module.update(self)
    
    def compute(self):
        if "a1" in self.outputPorts:
            self.setResult("a1", "test")
        if "a2" in self.outputPorts:
            self.setResult("a2", "test2")
        self._cached_output_ports = set(self.outputPorts)

Configuring Ports - Determining Whether or Not a Module is Attached to an Output Port

How can I make a module not display in the modules list?

You should set the abstract parameter to True when adding the module to the registry. Using the original syntax, this looks like:

def initialize():
    reg = core.modules.module_registry.get_module_registry()
    reg.add_module(InvisibleModule, abstract=True)
    # ...

With the _modules list shortcut (for more details, see the FAQ section on this), you include it in a kwargs dict as part of a module tuple:

_modules = [AnotherModule, (InvisibleModule, {'abstract': True})]

There is also a 'hide_descriptor' parameter that prevents the module from appearing in the module palette without declaring it to be abstract.

The technical difference between the two is that 'abstract' will not add the item to the module palette while 'hide_descriptor' does add the item but immediately hides it. If the module should never be instantiated in a workflow, declare it abstract. If you don't want users to be able to add the module to a pipeline, but you have code that may add it programmatically, declare it with hide_descriptor=True.

Configuring Modules - Hierarchy and Visibility

How do I document individual ports?

To access port documentation, users can right-click on the port in the port list and choose the corresponding menu item. To provide this documentation, you should define the provide_input_port_documentation and/or the provide_output_port_documentation class methods. Note that these methods take the class and the port name as arguments. For example,

class MyModule(Module):
    _input_ports = [('test', '(edu.utah.sci.vistrails.basic:String)'),
                    ('test2', '(edu.utah.sci.vistrails.basic:String)')]
    port_docs = {'test': 'Some documentation',
                 'test2': 'More documentation'}
    @classmethod
    def provide_input_port_documentation(cls, port_name):
        return cls.port_docs[port_name]
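
The same pattern extends to output ports, and a lookup with a fallback avoids a KeyError for ports that have no entry. The sketch below is self-contained: Module is a stand-in for the VisTrails base class, and the dictionary names, port names, and fallback text are assumptions chosen for illustration.

```python
class Module(object):
    """Stand-in for the VisTrails Module base class (assumption)."""


class MyModule(Module):
    input_port_docs = {'test': 'Some documentation'}
    output_port_docs = {'result': 'The computed value'}

    @classmethod
    def provide_input_port_documentation(cls, port_name):
        # .get() avoids a KeyError for ports without an entry
        return cls.input_port_docs.get(port_name, 'No documentation available.')

    @classmethod
    def provide_output_port_documentation(cls, port_name):
        return cls.output_port_docs.get(port_name, 'No documentation available.')
```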

How do I access modules from other packages?

Currently, it is best to access modules from the registry. First, make sure that any dependencies on another package are specified in the package_dependencies method in __init__.py. To create a module from another package as an output, you can generate it from the registry. For example,

from core.modules.module_registry import get_module_registry
from core.modules.vistrails_module import Module

class ReturnFigManager(Module):
    _output_ports = [('figManager',
                      '(edu.utah.sci.vistrails.matplotlib:MplFigureManager)')]

    def compute(self):
        reg = get_module_registry()
        descriptor = reg.get_descriptor_by_name(
            "edu.utah.sci.vistrails.matplotlib", "MplFigureManager")
        # instantiate the module class stored on the descriptor
        wrapper = descriptor.module()
        wrapper.figManager = "blah"
        self.setResult('figManager', wrapper)

You can also subclass classes obtained from the registry. For example,

MplFigureManager = get_module_registry().get_descriptor_by_name(
    "edu.utah.sci.vistrails.matplotlib", 
    "MplFigureManager").module
class MplFigureManagerSubclass(MplFigureManager):
    pass


How do I create a custom module configuration widget?

See Module Configuration Example for a full example and notes about doing this.

Can I make a PythonSource module cacheable?

Yes. If you have a module that you are planning to re-use in a workflow, we recommend making a packaged module (packaged modules are cacheable by default). However, you can make a PythonSource (which is not cacheable by default) cacheable using the line

self.is_cacheable = lambda *args, **kwargs: True

in the source of the PythonSource module.
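
The reason this one-line assignment works is ordinary Python attribute lookup: an attribute set on the instance takes precedence over the method defined on the class. The sketch below demonstrates only that mechanism with a stand-in class (PythonSourceLike is a hypothetical name); it does not import or reproduce the real PythonSource module.

```python
class PythonSourceLike(object):
    """Stand-in for a module whose class reports it is not cacheable."""

    def is_cacheable(self):
        return False


module = PythonSourceLike()
before = module.is_cacheable()

# same assignment as in the answer above, applied to the stand-in instance
module.is_cacheable = lambda *args, **kwargs: True
after = module.is_cacheable()

# other instances are unaffected because the class itself is unchanged
other = PythonSourceLike().is_cacheable()
```

Because the override lives on the instance, it affects only the one PythonSource module whose source contains the line.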

The Console

Where should I go to find out what I can call from the console and how to import it?

We have tried to make some methods more accessible in the console via an api. You can import the api via from vistrails import api in the console and see the available methods with dir(api). To open a vistrail:

from vistrails import api
api.open_vistrail_from_file('/Applications/VisTrails/examples/terminator.vt')

To execute a version of a workflow, you currently have to go through the controller:

api.select_version('Histogram')
api.get_current_controller().execute_current_workflow()

Currently, only a subset of VisTrails functionality is directly available from the api. However, since VisTrails is written in python, you can dig down starting with the VistrailsApplication or controller object to expose most of our internal methods. If you have suggestions for calls to be added to the api, please let us know.

One other feature that we're working on, but which is still in progress, is the ability to construct workflows via the console. For example:

vtk = load_package('edu.utah.sci.vistrails.vtk')
vtk.vtkDataSetReader() # adds a vtkDataSetReader module to the pipeline
# click on the new module
a = selected_modules()[0] # get the one currently selected module
a.SetFile('/vistrails/examples/data/head120.vtk') # sets the SetFile parameter for the data set reader
b = vtk.vtkContourFilter() # adds a vtkContourFilter module to the pipeline and saves to var b
b.SetInputConnection0(a.GetOutputPort0()) # connects a's GetOutputPort0 port to b's SetInputConnection0

Finding Methods Via the Command Line

Persistence Package

How do I use the output of one workflow as the input for another using the persistence package?

You need to configure the persistence modules using the module's configuration dialog. After adding a PersistentOutputFile to the workflow, click on the triangle in the upper-right corner of the PersistentOutputFile, and select "Edit Configuration" from the menu that appears. In this dialog, select "Create New Reference" and give the reference a name (and any space-delimited tags). Upon running that workflow, the data will be written to the persistent store. In the second workflow where you wish to use that file, add a PersistentInputFile and go to its configuration dialog in the same manner as with the output file. In that dialog, select "Use Existing Reference" and select the data that you just added in the first workflow in the list of files below. Now, when you run that workflow, it will grab the data from the persistent store.

Here is an example: offscreen_persistent.vt. Run the "persistent offscreen" workflow first, and then run the "display persistent output" to use the output of the first workflow as the input for the second.

VTK

Given a VTK visualization, how can I generate a webpage from it?

Check out the html pipeline in offscreen.xml.

I'm trying to use VTK, but there doesn't seem to be any output. What is wrong?

To use VTK in VisTrails, you need a slightly different way of connecting the renderer modules. Instead of using the standard RenderWindow/RenderWindowInteractor infrastructure, you simply connect the renderer to a VTKCell. The examples directory in the distribution has several VTK examples that illustrate this.

I am trying to add a module to the workflow via Python, but how can I access vtk modules?

Here's an example:

import api

vtvtk = 'edu.utah.sci.vistrails.vtk'

module = api.add_module(0, 0, vtvtk, 'vtkContourFilter', '')  # last argument: namespace (empty here)


The third argument in add_module is the package identifier. You can find this in the "Module Packages" panel of the Preferences; just click on the package you're interested in and it will appear in the information on the right.

matplotlib

I'm experiencing a problem with Latex labels and the matplotlib that comes with VisTrails 1.5. The script below entered to the interpreter that comes with VT is sufficient to reproduce it.

  import matplotlib.pyplot as plt
  plt.plot([1,2,3],[1,2,3])
  plt.xlabel("$foo$")

Remove your ~/.matplotlib folder and restart VisTrails.

rpy

Package rpy fails with "module object has no attribute RVector"

The rpy package needs to be updated to support a newer rpy version. In "packages/rpy/init.py", replace all instances of "objects.RVector" with "objects.Vector", or use this file.
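
If you prefer to script the replacement rather than edit the file by hand, something like the following would do it. This is only a sketch using the standard library; replace_in_file is a hypothetical helper, and you should keep a backup of the file before running it.

```python
import io


def replace_in_file(path, old='objects.RVector', new='objects.Vector'):
    """Rewrite the file in place and return the number of replacements made."""
    with io.open(path, 'r', encoding='utf-8') as handle:
        text = handle.read()
    count = text.count(old)
    if count:
        # only rewrite the file when there is something to change
        with io.open(path, 'w', encoding='utf-8') as handle:
            handle.write(text.replace(old, new))
    return count
```

For example, replace_in_file('packages/rpy/init.py') run from the vistrails directory returns how many occurrences were changed; a result of 0 means the file was left untouched.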

JobSubmission

The JobSubmission package depends on the stable version of BatchQ. Download https://github.com/troelsfr/BatchQ/archive/stable.zip and copy the "BatchQ-stable/batchq" directory to your local site-packages folder. Copying it to the "vistrails/packages/JobSubmission" folder should also work. See batchq/contrib/vistrails for examples.

VisTrails Development

I would like to build VisTrails from source. Are there instructions on how to do this?

Yes! Take a look at Installing VisTrails from source

Accessing Provenance Information

How do I access the information in the execution log?

The code responsible for storing execution information is located in the "core/log" directories, and the code that generates much of that information is in "core/interpreter/cached.py". Modules can add execution-specific annotations to provenance via annotate() calls during execution, but much of the data (like timing and errors) is captured by the LogController and CachedInterpreter (the execution engine) objects. To analyze the log from a vistrail (.vt) file, you might have something like the following:

import core.log.log
import db.services.io

def run(fname):
    # open the .vt bundle specified by the filename "fname"
    bundle = db.services.io.open_vistrail_bundle_from_zip_xml(fname)[0]
    # get the log filename
    log_fname = bundle.vistrail.db_log_filename
    if log_fname is not None:
        # open the log
        log = db.services.io.open_log_from_xml(log_fname, True)
        # convert the log from a db object
        core.log.log.Log.convert(log)
        for workflow_exec in log.workflow_execs:
            print 'workflow version:', workflow_exec.parent_version
            print 'time started:', workflow_exec.ts_start
            print 'time ended:', workflow_exec.ts_end
            print 'modules executed:', [i.module_id
                                        for i in workflow_exec.item_execs]

if __name__ == '__main__':
    run("some_vistrail.vt")

You should be able to see what information is available by looking at the "core/log" classes. Accessing the Execution Log

VisTrails Binaries

Is there a Mac OS X 10.6+ x64 binary of the version 1.7 of VisTrails available?

We don't have a 64-bit Mac binary for the v1.7 release because, at the time, we didn't have 64-bit versions of the libraries shipped in the 1.7 binary.

However, it is possible to update a 64-bit (or any other) binary with a source release of VisTrails, including the 1.7 sources or the nightly builds.

Assuming you have the 1.7 sources in /vistrails1.7 and the 64-bit binary in /Applications/VisTrails1.7, run the following commands:

  cp /vistrails1.7/vistrails/vistrails.py  /Applications/VisTrails1.7/VisTrails.app/Contents/Resources
  cp -r /vistrails1.7/vistrails/api /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/core /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/db /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/gui /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/packages /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/examples /Applications/VisTrails1.7/
  cp -r /vistrails1.7/extensions /Applications/VisTrails1.7/
  cp -r /vistrails1.7/scripts /Applications/VisTrails1.7/

AVG Antivirus falsely reports a virus in the VisTrails 2.0.2 32-bit Windows installer

The problematic file is vistrails/Python27/Lib/site-packages/_mysql.pyd: https://www.virustotal.com/en/file/d8aabd921b5eba8aabcce936ce3b92e3d1de43eb44c43d921ca1b9ab91d7fd81/analysis/1366640335/. This is most likely a false positive and can be ignored.

VisTrails fails after upgrading to OSX 10.9

Reinstalling XQuartz should solve the problem.