Development
Roadmap
Weekly Meetings
Dec 7, 2010
- Question: do we display 'warnings' or only errors?
- Question: Should v1_0_1 MySql schemas be upgraded to v1_0_2?
- Some suggestions/requests by Matthias:
- Support the grouping of vistrails into projects
- Ability to set global parameters/bookmarks=workflows (e.g., a filename or path)
- Archive results -- create a big tar/zip file with all vt and results
- Ability to add a vt: if a node is deleted, all data associated with that should also be deleted
- Create a test suite: run a workflow, save and archive results (user could specify which results should be saved)
- Data/provenance browser
- We need a better mechanism that allows users to search/query for workflows and provenance, for example search by project, parameter names and values
- One way to view this is to allow the creation of mappings between the vistrails/projects into different structures---like "smart folders", e.g., /projectN/vt1/wf3/exec23/O1....
- Support editing and synchronization in both ways (if edit smart folder, should update vt)
- Should be able to script the use of the browser, e.g., to test a new version of a workflow against a saved test suite
- On merging master and crowdlabs git branches
- some updates on the crowdlabs branch need to be merged/patched into the master branch
- how should this be done?
- PythonSource editor
- modal or multiple window editors?
- Unit tests
- [ES]: I ran the test suite on the current master branch: 207 tests are executed, of which 11 fail. Some of the tests fail because the console_mode_test package can no longer be imported (this package was created specifically to run console_mode tests).
- Most of the other problems are issues with
InvalidPipeline: Pipeline cannot be instantiated: Missing version 0 of package <package>
- Data publishing with CrowdLabs
- Using git to manage data between local and server data stores might be too messy.
- Does one create a new git repository for each vistrail on crowdlabs?
- The local persistent git repo holds all persistent data used on that machine. How does one push/publish just one data product using git?
- Can anybody think of a plausible git workflow to do this?
- Perhaps we should just look into adding directory support for RepoSync module along with a better data management UI.
- Are there OS-independent file transfer protocols better than HTTP? Something like rsync might work nicely, but Windows support is limited.
Nov 30, 2010
- Subworkflow Upgrades (Daniel)
- Should be finished tonight
- Local subworkflow upgrades should be working now
- Need to test these after the checkin
- Using a dummy controller
- Assistant for control package (Daniel)
- Look at examples
- Try to make an assistant to modify workflow using currently selected modules as looping group
- Improve error handling/reporting
- Tons of message boxes pop up when a workflow has a bunch of errors, UI response is bad after this...
- Update existing error messages
- Show Details rendering on Mac
- What to do when multiple errors occur in a single event?
- New dialog that shows "next message" but allows user to dismiss messages or view all in the messages window
- Color coding for messages: use gui/theme.py to configure: maybe try black for warning, gray for log level messages
- Data and workflow browser (Nivan will give a status update)
- Saving Explorations (Feature Request from ALPS)
- We already save parameter explorations with the workflow in the version tree. As is, we save only the last exploration that was done for a given workflow. If you ship the vistrail, the saved parameter explorations will be there. Is this sufficient?
- [MT] Yes, that takes us a long way. Could one add a button in the explorations to save a certain set of explorations as a new version, like you do with the camera angle in the spreadsheet? That way one could store more than one set of exploration parameters without forcing a new version to be created with every change?
- Eventually we should provide a selector in parameter exploration
- Fastest to just allow a null action new version (a bit clunky) but wouldn't require too many changes
- Clean up/prune persistent directory
- What are the concrete use cases for this?
- [MT] Here are two concrete use cases that appeared over the last week:
- a project is done, the important results copied to an archive and I want to get rid of all its intermediate files, etc., to save space - but not those files associated with other projects. Or, similarly, I'm done testing some workflows, want to remove all (partially buggy or irrelevant) files, and then start with production runs.
- there was a problem in a workflow (e.g. a bug in HTTPFile module with binary files or a bug in some user code). I want to force a certain persisted directory to be "forgotten", so that I can repair the wrong files. Bumping the version number of the ALPS package would be one way, but that is overkill since it would immediately invalidate all persisted files. Removing the whole persistence directory is just as radical.
- just testing simple tutorials with small files leaves me with 2 GB of persistent data after 10 days, in 142 persistent directories or files. I need some way to clean this up that is less radical than removing all persistent directories and files. This could be done by, e.g., removing all those which do not occur in tagged or leaf workflows in a certain set of vistrails. Or marking those which occur in certain vistrails with certain labels, and then being able to prune all those with or without some label/tag. Another option might be having the ability to choose from various repositories.
- [DK] follow up on this
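A minimal sketch of one pruning rule suggested above: keep only the persistent entries referenced by tagged or leaf workflows in a chosen set of versions, and treat everything else as a candidate for removal. The `Version` record, its `persistent_ids` field, and `prune_candidates` are hypothetical names for illustration; they are not part of the persistence package.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Version:
    """Hypothetical stand-in for one workflow version in a vistrail."""
    id: int
    parent: Optional[int] = None
    tag: Optional[str] = None
    # Ids of persistent files/directories this version refers to (assumed layout).
    persistent_ids: Set[str] = field(default_factory=set)


def prune_candidates(versions, stored_ids):
    """Return the stored ids that no tagged or leaf version refers to."""
    parents = {v.parent for v in versions if v.parent is not None}
    keep = set()
    for v in versions:
        is_leaf = v.id not in parents
        if is_leaf or v.tag is not None:
            keep |= v.persistent_ids
    return set(stored_ids) - keep


versions = [
    Version(1, None, None, {"a"}),      # untagged inner node
    Version(2, 1, "final", {"b"}),      # tagged
    Version(3, 2, None, {"c"}),         # leaf
    Version(4, 1, None, {"d"}),         # leaf
]
candidates = prune_candidates(versions, {"a", "b", "c", "d", "e"})
```

Here `candidates` holds the entries used only by the untagged inner version plus one entry that no version references. The function only lists candidates; deleting them would stay a separate, explicit user action.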
- Improvements to VisTrails server communication with clients to better handle errors/crashes (Phillip)
- work in progress
- Data Publishing with crowdLabs. (Phillip)
- Do we want data publishing capabilities in VisTrails and crowdLabs? YES!!!
- If so...
- Currently limited data provenance for data. HTTPFile module lacks meta data, RepoSync module has poor usability.
- Persistent Data works for experimental data. Can we use it for publishing data?
- Publishing data or storing data on the server and pushing back and forth?
- What is the current scheme for identifying data on the server?
- Do we need to do better than the ids, versions, hashes that the persistence package provides?
- Do we want to utilize data publishing software like The Dataverse Network or maybe Fedora Commons?
- [ES] Feature request: Can RepoSync support a directory of files?
- [PM] It can't, and I'm not sure how to add it. I also don't really like the current RepoSync interface. It's not obvious how your data is being handled, which is why I'm bringing up data publishing, so hopefully we can come up with something more usable/robust.
- persistence used more for exploratory
- want to focus on more archive
- move from exploratory data to published
- Dataverse: essentially a site to curate data, archive data, host their own site, each university has their own server, linked together
- persistent store integration: maybe use git to do this, can add to local repositories to move to git
- use some of the dataverse: standard hash schemes, etc.
- privacy: can users push workflows without exposing data
- permissions for who can check things out: download permissions, upload permissions
- look into setting up persistent store on
- DisplayWall (Wendel)
- new version of vistrails receives rotate/scale messages much more slowly
- using the vtk version that is already there
- comparing VisTrails 1.2 with nightly src
- test when the messages are received on the clients
- [ES] Changed something in the VTKCell, using the Qt interactor from vtk now, just uncomment this (vtkcell.py file)
- some commands are missed; check why this is happening
- preferences switch? 2x2 not 2x3
- check what modifications exist in the client code that's not exactly 1.2?
- new version of vtk package changed some things
- update vtk upgrade: resetcamera and addrgbpoint => _# versions
- Demos on displaywall: Wendel has to keep both VisTrails running and stop work when visitors stop by unannounced
- Relative paths in workflows, how to deal with them? (Emanuele)
- [ES] Updated all examples to use HTTPFile for these
- some of these had to change completely (color widget), kept tree the same format
Nov 16, 2010
- Data and workflow browser (Nivan)
- Check into git branch (Dave)
- Add data support to the vistrails, workflows, and execs already exposed in the browser
- Add thumbnails to the browser
- Bittorrent support?
- Upgrade issues -- a wrap? (Daniel)
- Fixing some issues with namespaces
- Upgrades for subworkflows that are included with packages, need to save these upgrades with the vt file as a "local abstraction" to ensure provenance. May be able to tie these back to the package versions once the package is upgraded.
- Test Suite
- Matthias has his own test suite now
- We should make sure that our tests are up-to-date and run them before releases
- Web services on users' guide (Tommy)
- Users' guide needs only a few changes. One of the changes should be a note making it explicit that we are using a new library and the old one was deprecated.
- Do we want to update the 39-page guide as well? For now, just update the users' guide. Don't distribute this guide. Just include it in a zip file that includes the old version.
- Improve error handling/reporting (Tommy)
- print statements, has a new window
- add a menu item to view message window and also keep all messages from the session
- look into writing QMessageBox errors and warnings to message window
- write all messages to a single window, write critical messages in red?
- have a utility method that encapsulates the gui error display and the core.debug display (if not running gui, only write to core.debug)
- don't display stack trace in a qmessagebox? -- try to emulate the invalid pipeline with "Show Details" button.
- make it easier for users to report errors by having a "copy" or "send to developers" button to report the details of an error.
- DisplayWall (Wendel)
- What examples should be in the SVN? If we decide to keep some examples in the svn, some of them use data that should come along with them.
- Indicate progress of HTTPFile
- Had to upgrade examples to work with new versions of VisTrails
- VisTrails can send progress information with another thread like crowdlabs can do
- Each machine has a .vistrails and needs to download own version of data, make a single .vistrails for all machines?
- All slave machines can use the server as a cache?
- DisplayWall Client on the Apple Store? I don't like the idea of sharing the client source code. And putting the app without login/password will allow anyone to access our displaywall.
- Add a password for this
- Do user management?
- VTK package:
- Have two ResetCamera modules now
- Need to write upgrade code
- Only with vtk 5.7 right now
- Encode library version in the package version?
- For reproducibility, we probably want to stick to specific versions
- We still need some ability to run new versions of vtk for testing, new features, etc., too...
- Relative paths in workflows, how to deal with them? (Emanuele)
- Should we use dataDirectory?
- Change examples to not use hard-coded files?
- Problem with workflows, not data files
- On Windows, use ../examples
- On Mac, use ../../../examples
- dataDirectory not used, I think
- create "$VT_EXAMPLES"?
- problem is using new versions of VisTrails, data may change so referencing old location will fail
- don't allow users to change this variable
- convert to use HTTPFile
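For reference, the `$VT_EXAMPLES` option raised above could look roughly like this: workflows store a placeholder instead of `../examples` or `../../../examples`, and the placeholder is expanded against a per-installation directory at load time. `resolve_example_path` and the example directory are hypothetical; this is only a sketch of the idea, not VisTrails code.

```python
import os

PLACEHOLDER = "$VT_EXAMPLES"


def resolve_example_path(stored_path, examples_dir):
    """Expand a leading $VT_EXAMPLES placeholder; leave other paths alone."""
    if not stored_path.startswith(PLACEHOLDER):
        return stored_path
    # Accept either separator in the stored path so that workflows saved
    # on Windows and on Mac resolve the same way.
    rest = stored_path[len(PLACEHOLDER):].replace("\\", "/").strip("/")
    parts = [p for p in rest.split("/") if p]
    return os.path.join(examples_dir, *parts)


resolved = resolve_example_path("$VT_EXAMPLES/data/head.vtk", "/opt/vistrails/examples")
untouched = resolve_example_path("/tmp/other.vtk", "/opt/vistrails/examples")
```

The stored workflow never contains a platform-specific relative path, but it still breaks if the example data moves between VisTrails versions, which is the concern noted above.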
- Data files and the GIT Repository (Emanuele)
- make a separate repository or use the submodule git functionality
- go with submodule
- Next release: What still needs to be done?
- high priority from ALPS list
- Dan's fixes to the upgrade workflow
- Date for release? Official release in beginning of December for ALPS
- All developers should run test suite on push
- Could make this somewhat automatic, run with nightly release script and email -dev list
- Could have different classes of tests, make sure that critical ones pass
- Tests have a bunch of errors currently
- Saving explorations:
- Keeps track of the latest exploration for each version
- Check with Matthias what the requirements for this are
- Have way to send just specification to a server (run this workflow with this parameter exploration)?
- Can we have an xml spec that specifies the parameter exploration (export parameter exploration)
- Currently persisting via xml and then converting to string so we already have an xml serialization
- Dan wrote API that takes xml string and populates the param_exp gui with the appropriate values
- Need to back the GUI state with core state to enable short-circuiting the gui
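A rough sketch of what an exported parameter exploration spec might look like, using only the standard library. The element and attribute names (`paramexp`, `dimension`, `param`, `start`, `end`, `steps`) are invented for illustration and are not the serialization VisTrails currently uses.

```python
import xml.etree.ElementTree as ET


def export_exploration(workflow_version, dimensions):
    """Serialize an exploration (list of dimensions, each a list of params)."""
    root = ET.Element("paramexp", version=str(workflow_version))
    for index, params in enumerate(dimensions):
        dim_el = ET.SubElement(root, "dimension", index=str(index))
        for p in params:
            ET.SubElement(dim_el, "param", name=p["name"],
                          start=str(p["start"]), end=str(p["end"]),
                          steps=str(p["steps"]))
    return ET.tostring(root, encoding="unicode")


def import_exploration(xml_string):
    """Parse the spec back into (workflow_version, dimensions)."""
    root = ET.fromstring(xml_string)
    dimensions = []
    for dim_el in root.findall("dimension"):
        dimensions.append([
            {"name": p.get("name"), "start": float(p.get("start")),
             "end": float(p.get("end")), "steps": int(p.get("steps"))}
            for p in dim_el.findall("param")
        ])
    return int(root.get("version")), dimensions


dims = [[{"name": "isovalue", "start": 0.0, "end": 1.0, "steps": 5}]]
spec = export_exploration(12, dims)
version, restored = import_exploration(spec)
```

Because the spec carries the workflow version and the ranges but no GUI state, a server could in principle run it directly, which is the short-circuiting mentioned above.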
- Box input ports should be colored (e.g. black) if they are already set by an internal method. Another color (e.g. red) could be used to indicate mandatory input ports.
- Currently, non-optional ports are square, shown. Optionals are circles and must be enabled
- Add a third category for mandatory, problem is backward compat.
- [ES] Colors are already overloaded, not for ports, use shading instead?
Nov 9, 2010
- FAQ
- As we reply to users' queries, let's add the question and answer to the FAQ!
- Just a reminder
- Caching of File module
- There was a message from a user who got confused because this module is cached by default (and silently). And in his application, since the file actually changed in between runs, he did not see the 'expected' result.
- Should we keep a hash for files and check whether they changed?
- Should we not cache File by default?
- [DK] File is supposed to have a special signature computation that detects changes in the file contents so this may be a bug instead. It does work, it seems to be an issue with the create_file
- [ES] In this particular case, the file was empty. He deleted the file outside VisTrails and executed it again. As he set create_file to True, he was expecting the file to be created again.
- [DK] Add random salt to hash signature when the file doesn't exist and create_file is True
- [JF: DONE] Juliana will edit the FAQ to note that the signatures are dependent on modification times
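A minimal sketch of the salt idea from [DK], assuming the signature is a hash over the path plus the file's modification time and size. `file_signature` is a hypothetical helper, not the File module's actual signature code.

```python
import hashlib
import os
import tempfile
import uuid


def file_signature(path, create_file=False):
    """Hash that changes when the file changes, or on every call when the
    file is missing and create_file is set (so the module is not cached)."""
    h = hashlib.sha1()
    h.update(path.encode("utf-8"))
    if os.path.exists(path):
        st = os.stat(path)
        h.update(str(st.st_mtime).encode("utf-8"))
        h.update(str(st.st_size).encode("utf-8"))
    elif create_file:
        # Random salt: a missing file that must be created never matches
        # a cached signature, so the module is executed again.
        h.update(uuid.uuid4().bytes)
    return h.hexdigest()


missing = os.path.join(tempfile.gettempdir(), "vt_missing_" + uuid.uuid4().hex)
sig_a = file_signature(missing, create_file=True)
sig_b = file_signature(missing, create_file=True)
sig_c = file_signature(missing)
sig_d = file_signature(missing)
```

With the salt, `sig_a` and `sig_b` differ, so the file would be recreated after it was deleted outside VisTrails. Without `create_file` the signature stays stable (`sig_c` equals `sig_d`).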
- Claurissa will demo different ways to visualize the version tree
- weighting on various criteria for displaying nodes (importance from session, user, tags, etc.)
- other modes for viewing the versions (lists or timelines)
- often see only linear trees, teaching or thinking other than tree?
- Upgrade issues (Dan)
- Vanilla upgrades should be in today's commit
- Added code that checks, when doing an upgrade, whether the latest code is already upgraded
- Two types of upgrades: (1) User changes the subworkflow, (2) Modules are out-of-date and need to be upgraded.
- Add higher-level features like "Upgrade All" for (1), "Import Changes to My Subworkflows", etc.
- Want a 1-1 mapping between unique subworkflow id and a subworkflow file (even if we copy the same subworkflow to different vistrails)
- Offscreen bug (update from Huy)
- Mac issue when switching to Cocoa
- Huy added a QWidget for each of the render windows that is created, so should be fixed.
- Gesture support
- Someone on Windows check the pinch gesture support; it doesn't work for Matthias under Parallels
- VTKCell is working now
- Cell dependent, panning now maps to middle button, pinch on the base widget
- What are Matthias's requirements here? Do touch gestures in the cells matter, or only in the vt/wf views?
- Thumbnails: [JF: this needs to go to trac, for 2.0?] [ES: 2 tickets were created for 2.0]
- Resolution? -- [ES] problem is the size of the vistrail, but this can be configured
- Can we export to PDF instead of PNG? [ES] cannot display the thumbnail as a PDF in Qt?
- Can we have the option to save higher-res or PDF versions of the thumbnails
- Can we have preferences that allow users to save a higher-res or PDF version, maybe a checkbox
- Maybe associate a high-res version with a spreadsheet cell so that we can save a high-res version on demand
- high-res version associated with a version but allow user to initiate action from spreadsheet
- can we have thumbnails dependent on camera position [ES] doesn't like, user won't see exactly that image upon re-executing
- compact vistrail option by removing thumbnails
- Tommy has rewritten the Web services module as well as updated all of the examples
- Server for one of the examples seems to be down
- Need to update the manual
- Improve error handling/reporting (Tommy)
- Maybe display the initialization messages with the splash screen?
- Debugging levels have no way to access debug level
- Tommy takes first cut at trying to determine how to map print statements, notes those he is unsure of
- Developer guidelines for debug usage, use them instead of print statements in the future
- Merge functionality (Tommy) [JF: Cool!]
- Can now merge an existing vistrail into your own vistrail using menu item
- Improvements to VisTrails server communication with clients to better handle errors/crashes (Phillip)
- [PM] I haven't had a chance to implement anything yet.
- Data Publishing with crowdLabs. (Phillip)
- Do we want data publishing capabilities in VisTrails and crowdLabs? YES!!!
- If so...
- Currently limited data provenance for data. HTTPFile module lacks meta data, RepoSync module has poor usability.
- Persistent Data works for experimental data. Can we use it for publishing data?
- Publishing data or storing data on the server and pushing back and forth?
- What is the current scheme for identifying data on the server?
- Do we need to do better than the ids, versions, hashes that the persistence package provides?
- Do we want to utilize data publishing software like The Dataverse Network or maybe Fedora Commons?
- [ES] Feature request: Can RepoSync support a directory of files?
- [PM] It can't, and I'm not sure how to add it. I also don't really like the current RepoSync interface. It's not obvious how your data is being handled, which is why I'm bringing up data publishing, so hopefully we can come up with something more usable/robust.
Nov 3, 2010
- Upgrading subworkflows (Dan)
- add updated versions to subworkflow vistrail and push to registry
- fixed bug with version/descriptor redirects
- key remaining task is to replace the box representing the subworkflow in the top-level workflow to reflect the change to the upgraded version of the subworkflow
- many corner cases, but we should get the vanilla upgrade (working) to git
- need to ensure that each subworkflow file has its own uuid; if a subworkflow exists in two different vistrails, it should have a different uuid (namespace).
- otherwise, we can get crosstalk where one vt can update the subworkflow from another vt (we can propagate changes to other vts via merges, but this should be a user choice...)
- enhancements:
- allow users to merge subworkflows from a file into their own subworkflow (to incorporate outside changes)
- latest version is the most recent non-upgraded version; if that version has an upgrade, use the upgrade
- Next step: work on usability for controlflow package; explore Dave's idea of an assistant
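The "latest version" rule in the enhancements above can be sketched as follows, under the assumption that "non-upgraded" means a version that was not itself produced by an upgrade. `resolve_latest` and its arguments are hypothetical; versions are plain ids listed in creation order and `upgrades` maps an original version to its upgraded replacement.

```python
def resolve_latest(versions, upgrades):
    """Pick the most recent version that is not an upgrade result, then
    follow its upgrade (and any further upgrades) if one exists."""
    upgrade_results = set(upgrades.values())
    originals = [v for v in versions if v not in upgrade_results]
    if not originals:
        return None
    latest = originals[-1]
    seen = set()
    # Follow the upgrade chain; `seen` guards against accidental cycles.
    while latest in upgrades and latest not in seen:
        seen.add(latest)
        latest = upgrades[latest]
    return latest


# Version 3 is the newest user-made version and has no upgrade.
plain = resolve_latest([1, 2, 3, 4], {2: 4})
# Version 3 is the newest user-made version and was upgraded to 4.
upgraded = resolve_latest([1, 2, 3, 4], {3: 4})
```

In the first call an upgrade of an older version (2 to 4) does not displace the newer user-made version 3. In the second call the upgrade of 3 is used.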
- Offscreen bug
- Huy is looking into this
- Crash due to issues in loaded user packages (Tommy has fixed this)
- Web services (Tommy)
- Package is completed, and all but one of our examples are working
- Tommy will test if the Web services will work when there is a proxy
- Improve error handling/reporting (Tommy)
- we should have a single point for all error messages to pass through that is linked to both core.debug and GUI elements that display error messages.
- the goal is to implement a function that will be the single point for error messages, and that works without emitting GUI signals in core. As Huy suggested, we should have a GUI-wrapper which will be a no-op when the GUI is not instantiated---this will get rid of some of the pyqt dependencies (at least for the error messages).
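A minimal sketch of the single entry point with a GUI wrapper that is a no-op in console mode. It uses the standard `logging` module as a stand-in for core.debug, and `report` / `set_gui_handler` are hypothetical names; the GUI would register a callback at startup, so core never imports PyQt.

```python
import logging

_logger = logging.getLogger("vistrails.sketch")  # stand-in for core.debug
_gui_handler = None  # stays None when no GUI is instantiated


def set_gui_handler(handler):
    """Called by the GUI layer only; core code never touches Qt."""
    global _gui_handler
    _gui_handler = handler


def report(level, message, details=""):
    """Single point every error/warning passes through."""
    _logger.log(level, message)
    if _gui_handler is not None:
        _gui_handler(level, message, details)


shown = []
report(logging.WARNING, "running without GUI")  # logged only, nothing shown

# What the GUI side might register (here a list instead of a message window).
set_gui_handler(lambda level, message, details: shown.append((level, message, details)))
report(logging.ERROR, "Missing package", "stack trace text")
```

The `details` argument is where a stack trace could go, so a "Show Details" style dialog can decide whether to display it.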
- Improvements to VisTrails server communication with clients to better handle errors/crashes (Phillip)
- Currently the VisTrails server consists of: a single-threaded instance with GUI; multiple GUI-less threads
- We will extend the server API to allow clients to check the server status. The server will provide a separate socket and the client will be able to 'ping' the server; if the server is working properly, it will respond; if it does not respond, the client will have a time out
- The API will also support the ability to kill both the GUI-less threads and the single-threaded GUI instance. We can then restart the server components using Emanuele's new script.
- We should also allow users to set a per-workflow time-out; if the workflow execution takes longer than the pre-defined threshold, VisTrails will abort the execution
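The status check described above could be sketched with plain sockets: the server answers `ping` on a separate status socket, and the client treats a refused connection or a timeout as "server not working". The wire format (`ping`/`pong`) and the function names are assumptions for illustration, not the existing server API.

```python
import socket
import threading


def serve_status(server_sock):
    """Answer 'ping' with 'pong' on the dedicated status socket."""
    while True:
        try:
            conn, _ = server_sock.accept()
        except OSError:
            return  # socket was closed
        with conn:
            if conn.recv(16).strip() == b"ping":
                conn.sendall(b"pong\n")


def ping(host, port, timeout=2.0):
    """Return True if the server answers within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ping\n")
            return conn.recv(16).strip() == b"pong"
    except OSError:  # includes connection refused and socket timeouts
        return False


# Demo on an ephemeral local port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=serve_status, args=(server,), daemon=True).start()
alive = ping("127.0.0.1", port)
```

A client that gets `False` could then call the kill/restart part of the API described above. The per-workflow timeout would be a separate mechanism on the execution side.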
- Update on fixing VTK Package (Wendel)
- It seems that the ParaView package has the same issues.
- Wendel has already checked in the new wrapper into the trunk; Emanuele will test it
- Synchronizing VisTrails and ALPS releases (Emanuele)
- VisTrails latex package was extended so that python is no longer required. If python is not present in the system, instead of issuing a request to execute the workflow and retrieve the image from the Web, a local, previously saved image will be used. It is now also possible to embed the images of workflows into the latex file. Documentation about these and other features is currently in the README file and examples (example.tex) provided with the latex package.
- Windows and Mac beta binaries already include ALPS
- Are we distributing ALPS as part of VisTrails then? This seems backwards. ALPS should include VisTrails as part of their stuff, no?
- We will have separate binary distributions: one with and one without ALPS
- Is a multi-touch interface to VisTrails possible now that it's supported by Qt? (Phillip)
- Yes, Qt 4.6+ supports multi-touch --- Huy will look into this; the goal is to make this work both for the workflow builder/tree view and the spreadsheet
- We still need to figure out what makes sense for multi-touch, however, do we have requirements from ETH-Zurich?
- We want to map the zoom, pan, click, etc. to multi-touch, so that VisTrails can be used interactively on a multi-touch screen without a mouse
- What is the status of the Vismashup i{Phone|Pad|Touch} app? (Phillip)
- If you mean mac binary, there's an alpha version here: http://www.sci.utah.edu/~emanuele/files/vismashup
- Do you mean iphone app? [Phillip: yes]
- Wendel will look into this
- One of the issues is how to effectively handle images that are larger than the memory on the iPods and iPads---we need to process these on the server or use a model that allows the image to be manipulated on the client
- We also need to connect the app with the crowdlabs server, so that the app can get the list of mashups
- Phillip will investigate the feasibility of working directly with JavaScript and bypass flash
Oct 19, 2010
- Update on Trac and Roadmap (Emanuele and David)
- Trac is now linked to git repository
- Roadmap on Trac has been cleaned up so we can hide completed milestones
- Tickets are being reassigned/revisited. Many tickets (24) are not associated with a milestone yet https://vistrails.sci.utah.edu/report/3
- Synchronizing VisTrails and ALPS releases
- Web services support (Tommy)
- discuss the interaction between the Web Services package and upgrades
- Subworkflows update (Daniel)
- Testing of new module drawing (Erik)
- Aliases
- Currently, an alias is stored on a parameter. We need aliases to be stored at a higher level so that changing an alias is not a change to a parameter. The one issue is that there is a link between parameters and aliases in that an alias can only exist for versions that have the specified parameter. We might also have two versions where the alias points to different parameters. We could just store aliases as a root-level workflow element so that the set of aliases is versioned correctly.
- In the current implementation, the aliases parsed from the parameters are stored in a dictionary in the workflow, so storing aliases as a root-level workflow element might be the way to go --Emanuele 19:50, 15 October 2010 (MDT)
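A small sketch of aliases kept as a root-level workflow element, so that re-pointing an alias does not modify any parameter. The `Workflow` class and its fields are hypothetical and much simpler than the real workflow model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Workflow:
    """Hypothetical workflow version with aliases stored at the root level."""
    parameters: Dict[int, str]                             # parameter id -> value
    aliases: Dict[str, int] = field(default_factory=dict)  # alias name -> parameter id

    def set_alias(self, name, param_id):
        # An alias can only exist in versions that have the parameter.
        if param_id not in self.parameters:
            raise KeyError("no parameter %r in this version" % (param_id,))
        self.aliases[name] = param_id

    def resolve(self, name):
        return self.parameters[self.aliases[name]]


v1 = Workflow(parameters={10: "head.vtk", 11: "brain.vtk"})
v1.set_alias("input", 10)

# A new version that only re-points the alias: the parameters are identical.
v2 = Workflow(parameters=dict(v1.parameters), aliases=dict(v1.aliases))
v2.set_alias("input", 11)
```

Each version carries its own alias table, so two versions can map the same alias to different parameters, which is the case described above.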
- Error logging
- Suggest that we have a single point for all error messages to pass through that is linked to both core.debug and GUI elements that display error messages. This should improve our error handling significantly
- PyQt dependencies in core
- Can we get rid of PyQt dependencies in core (and db)? This would probably require moving the signals/slots that we currently use to a similar Python implementation which shouldn't be too difficult. This would also require the specification of configuration widgets not as classes but rather some text that can be used to import the GUI elements only when using the GUI code. However, what would happen with workflows that run in command-line mode but require some graphical output?
- Ports
- One annoying part of the current VisTrails model is that users are not given any visual indication that a port value is already set by a function or vice versa by a connection. In addition, we have no way to specify or enforce cardinality on ports at design time. It seems like we could allow developers to specify when a port should not be connected to more than one value (via a function or a connection), and give some visual feedback when a port has already been specified in one manner.
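A design-time cardinality check along these lines might look like the sketch below. The port spec format (a maximum number of values per port, `None` for unlimited) and `over_specified_ports` are assumptions for illustration; the port names are just sample strings.

```python
from collections import Counter


def over_specified_ports(port_limits, function_ports, connection_ports):
    """Return the ports that receive more values than their limit allows.

    port_limits: port name -> max number of values (None means unlimited)
    function_ports: port names set by a function (parameter values)
    connection_ports: port names fed by an incoming connection
    """
    counts = Counter(function_ports) + Counter(connection_ports)
    return sorted(
        name for name, limit in port_limits.items()
        if limit is not None and counts[name] > limit
    )


limits = {"SetFileName": 1, "AddInput": None}
problems = over_specified_ports(
    limits,
    function_ports=["SetFileName"],
    connection_ports=["SetFileName", "AddInput", "AddInput"],
)
```

The same counts could drive the visual feedback: a port with a nonzero count is already specified, whether by a function or by a connection.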
Meeting Notes
- Bug regarding userpackages at initial vistrails startup causing startup to fail.
- Still an issue, ticket still open as a 1.6 milestone
- Copy-paste bug
- Shortcut not working on Mac. Focus makes shortcut try to copy pipeline.
- Current fix appends to clipboard as users type - this is not a good fix.
- Synchronizing ALPS and VisTrails releases
- 1.6 scheduled for Dec. 1
- Webservices
- New library being used, but it's very low level.
- Simple types may cause problems with some web-services using complex or XML-based types.
- Need to make sure that this library can handle at least MOST of the web-services out there.
- Sub-workflows
- Daniel's changes seem functional and very slick.
- A little more work on it seems necessary.
- Need some easy GUI mechanism to delete a subworkflow.
- Is the exclamation point in the module draw the best way to handle this? Make a tool-tip to explain it.
- VTK Wrapping
- Changes to VTK Python wrapping are causing headaches - particularly in backwards compatibility.
- Method typing has changed to expect things like "List" or "Vector" - makes things hard on the user when defining these inputs.
- Removing List and Vector types and replacing them with dynamically generated versions from Tuples.
- Aliases - Agenda pushed until after Vis.
- Error Logging - Unify some error handling to improve how exceptions are dealt with. More on this after Vis.
- Refactoring out PyQt dependencies from Core/Db/etc - Need to move all the signals/slots into GUI elements. Questions come when a workflow uses GUI elements as inputs. - More after vis.
- Ports - We need a way of showing how a port is specified - function vs. input port. Cardinality must be established in these cases. Ordering on multiple connections should be handled in some way - whether it's just a documentation issue or a more fundamental one.
- Should we have a tutorial mode that brings up tooltips when someone does something new, as if it were a live tutorial?
Oct 12, 2010
- Trac and Roadmap (Emanuele): Emanuele suggested we should create a development roadmap and make it a permanent item on the agenda. The idea is to go through the Trac tickets and use them to build the roadmap that would be made public.
- go through trac and sort by priority
- add other suggestions to roadmap as well
- admins can add milestones via the admin tab
- David will assign tasks on list; those assigned need to follow up and check and fix them
- Update on PythonSource error reporting and logging (Tommy)
- Need to make sure that all messages are printed through debug.X, so that the debug level (verboseness in preferences) is correctly used
- Can cut/copy/paste now
- Can we prevent the delete from happening?
- Phillip noted the spyder project has an interactive python console: http://packages.python.org/spyder/
- Can press enter to get to the end
- Need to check if this works or if we need to change
- Now have ability to see stack trace from the GUI (triangle menu)
- Suggest using a dialog to display trace when this item is selected instead of printing to console
- Also saving stack trace to the execution log
- Also printing debug information to vistrails log
- Add trac item to fix prints to use debug
- Try to fix core.debug to have gui.debug to eliminate Qt dependency
- Update on subworkflow (Daniel)
- Notification when subworkflow is outdated triggered
- new_abstraction signal from registry in addition to new_module signal so that we can check if we need subworkflows to update
- How to display the possible upgrades (can be upgraded, can be upgraded but may break, etc.)
- New utility added by Carlos (currently at scripts/module_appearance) to draw fringes visually, which generates output that can be pasted directly into the add_module call.
- Erik will test this on Mac and Windows.
- Preparing a "Get Started" tutorial for SIGMOD repeatability
- Do we have instructions on how to use the latex package without crowdlabs?
- links to actual workflows in latex
- have README and example for latex in the source
- Metadata associated with VisTrails: should we have the ability to add vistrail-level metadata? e.g., who created the vistrail, its purpose, etc.
- have pointers to paper to workflow and vistrails
- add pointers to papers
- allow access to add/edit annotations at the vistrail level
- add GUI element to allow people to edit/add annotations
- Maintaining VTK package
- have four classes that don't wrap, have error even in python level
- works well with VisTrails otherwise
- haven't checked the new changes from the wrapping
- Wendel will check on the new version
- Web services package (Tommy)
Oct 5, 2010
- Welcome Tommy!
- Update on subworkflow (Daniel)
- Issues raised by Matthias:
- need to automatically upgrade subworkflows. Currently it is necessary to manually upgrade a subworkflow whenever the version of one of the modules inside them changes.
- explorations do not detect an MplFigureCell embedded in a subworkflow.