Difference between revisions of "Development"
Revision as of 15:29, 26 August 2015
See also the Github wiki
2015
August 26, 2015
Updates
- [TE] Added Job support to Parameter Explorations (Requested by Colin), and Mashups
- specify job ids, need to specify different ids for parameter explorations since they have the same version id
- how to deal with parameters passed in on command line
- [General] provenance: should we be creating a new version when we execute a workflow with changed parameters?
- currently, the custom_params annotation stores this in provenance
- [RR] API doesn't record provenance if passing in parameters: http://git.io/vsAA5
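A minimal sketch of one way to give each parameter-exploration run its own job id when all runs share a version id: append a digest of the parameter values. The helper name `make_job_id` and the id format are hypothetical and are not part of the VisTrails job code; the sketch also assumes parameter values are JSON-serializable.

```python
import hashlib
import json


def make_job_id(version_id, params):
    """Return a job id that changes whenever the parameter values change.

    Hypothetical helper: `version_id` is the shared workflow version and
    `params` is a dict of JSON-serializable parameter values.
    """
    # sort_keys makes the digest independent of dict insertion order
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.sha1(encoded).hexdigest()[:12]
    return "%s-%s" % (version_id, digest)
```

Two runs of the same version with different parameter values then get different ids, while re-running with identical values yields the same id again.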
Items to Discuss
August 19, 2015
Updates
- [RR] Improvements to tej, users' guide entry & example #1105
- [RR] Internal docs #1116
- On ReadTheDocs: vistrails.readthedocs.org
- [TE] Working on Job Monitor documentation
Items to Discuss
- [RR] JobMixin and JobMonitor: stable now?
- Definitely needs more tests
- [TE] Stop testing VisTrails 2.0?
August 12, 2015
Updates
Items to Discuss
- [RR] Internal docs #1116
- On ReadTheDocs: vistrails.readthedocs.org
- [RR] JobMixin and JobMonitor: stable now?
- Definitely needs more tests
- [RR] Improvements to tej, users' guide entry & example #1105
- [RR] Build broken on Travis; because of IPython 4 released today? (build 992; #1123)
August 5, 2015
Updates
- [TE] Fixes to Jobs
- Could not delete jobs
- Could not run job in group
- Job not reset when calling ModuleSuspended
- Added deleting job from context menu
- [RR] Writing documentation for everything #1105
Items to Discuss
July 29, 2015
Updates
- RR still looking into new interpreter thing
- Goal is to take out scheduling logic from Module so it can be split in multiple processes, and so that smarter strategies can be added in time
- This means some work on packages
- Spreadsheet can live in kernel process? Still some UI stuff that will take work (changing configuration, persistent archive's viewer, ...)
- [TE] Problems running examples #1111
- Testing of more examples requires additional packages on the test machines.
- Fixed faulty line endings in PythonSource modules, which were failing on Python 2.6.
- Test suite now testing SUDSWebServices (If web service is down, test suite will fail)
- preferences.py test failed reloading 'dialogs' package, switched to using 'URL'.
Items to Discuss
July 22, 2015
Updates
- [TE] mailing lists back online
- [TE] Working on #1107
- [RR] Working on ReproZip package
Items to Discuss
July 15, 2015
Updates
- 2.2.2 released
Items to Discuss
- RR is considering executing everything in an IPython kernel (i.e. separate Python interpreter, like the one spawned for a notebook)
- It's a separate process that we can restart without restarting the app/gui
- We can isolate the execution environment (e.g. for the server)
- We can run the whole thing remotely (if your data is elsewhere, just run VisTrails locally and the pipeline elsewhere)
- Pipeline execution no longer makes the interface hang, it just makes the kernel hang (but that's fine)
- We can use notebooks as modules (probably way nicer than the PythonSource module)
- We can run multiple kernels so long as the ports carry things that are serializable
- meaning we can put in the multithreaded interpreter without all the hacky parts it has now
- we can run IPython kernels in all the languages IPython supports, currently 46
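The core of the idea above (execution in a separate interpreter, with only serializable values crossing the boundary) can be sketched with the standard library alone. This is not the IPython kernel protocol: it assumes Python 3.7+ and JSON-serializable inputs and outputs, and `CHILD_SOURCE` and `run_in_child` are hypothetical names used only for illustration.

```python
import json
import subprocess
import sys

# Code run by the child interpreter: read JSON inputs, write JSON outputs.
CHILD_SOURCE = """
import json, sys
inputs = json.load(sys.stdin)
json.dump({"total": sum(inputs["values"])}, sys.stdout)
"""


def run_in_child(inputs, timeout=30):
    """Run CHILD_SOURCE in a fresh Python process and return its outputs."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps(inputs),   # inputs cross the boundary as text
        capture_output=True,
        text=True,
        timeout=timeout,            # a hung child cannot block the caller forever
        check=True,                 # raise if the child exits with an error
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    print(run_in_child({"values": [1, 2, 3]}))
```

Because the child is an ordinary process, it can be killed and restarted without touching the caller, which is the property the notes are after.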
July 8, 2015
Updates
Items to Discuss
- [TE] Buildbot github hook not working after IP address change
- [TE] No reference to VistrailsApplication (#1103)
- [RR] Added PR #1104
- [TE] Reopening VT file after saving with bundled subworkflow won't offer subworkflow upgrade (#1102)
- allow manual delete to fix right now
- fix this on top of the use-uuid branch
- [TE] Release VisTrails 2.2.1? (CHANGELOG)
July 1, 2015
Updates
- [TE] PROV fixed
- [TE] Working on subworkflow issues
- [RR] Considering reworking the controller (log vs exception problem, retained upgrades causing interferences) and also the interpreter (IPython?)
Items to Discuss
June 24, 2015
Updates
Items to Discuss
- UV-CDAT
- [TE] Can a cyclic workflow be valid? (#1097)
- focus on disabling the ability to create cyclic pipelines because more things break than just this with a cyclic pipeline
- [RR] Relative paths (#1057)
- This interacts with the new bundle; how do we handle packing files inside the VT bundle?
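Refusing to create a cyclic pipeline comes down to a reachability check before a connection is added. The sketch below assumes the pipeline is available as a plain dict from each module id to the ids it feeds into; `creates_cycle` is a hypothetical helper, not an existing VisTrails function.

```python
def creates_cycle(edges, source, target):
    """Return True if adding the edge source -> target would close a cycle.

    `edges` maps each module id to the ids it feeds into.
    """
    # A cycle appears exactly when `source` is already reachable from `target`.
    stack = [target]
    seen = set()
    while stack:
        node = stack.pop()
        if node == source:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return False
```

A connection editor could call this before accepting a new connection and reject the connection when it returns True.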
June 17, 2015
Updates
Items to Discuss
- [TE] current_version and reusing existing upgrades are broken (ticket #1095)
- Could be that export to PROV is using an unflushed upgrade pipeline?
- current_version would then be correct.
- It may work to flush the actions before exporting?
June 10, 2015
Updates
- DAT: fixed VTK issue on Linux and Mac
- Still crashes on Windows. Need help! Reminder: this works in the VisTrails spreadsheet (QCellPresenter), although no widget gets changed there during a drag
- Is it a VTK bug?
- Is it simply impossible to change widgets during the drag, should we do it a different way?
- Did I miss something that is done in VisTrails but somehow not in DAT?
- Low prio, UV-CDAT doesn't run on Windows anyway
Items to Discuss
- [TE] Executable Paper (ticket #1088 Updated pull request)
- Single instance code and batch mode
June 3, 2015
Updates
Items to Discuss
- [TE] Executable Paper (ticket #1088 pr)
- Requires fixes to command line parameters, Output modules, and batch mode
- How to test this
- Updated missing/outdated flags
- Fixed view issues when generating graphs
- [TE] batch mode
- SpreadsheetOutput not enabled in batch mode. Should we check is_running_gui instead?
- Other instance setting flags from caller
- Is graphsAsPdf replacing spreadsheetDumpPdf?
- graphsAsPdf true by Default?
- Batch mode executing by default (Not needed when generating graphs)
- Re-added workflowInfo as withWorkflowInfo for writing graph and xml workflow
- [DK] batch mode should be outputting to files or stdout, shouldn't always trigger SpreadsheetMode
- execute flag, maybe make execution the default and allow a "--no-execute" if you only want to capture graphs, for example
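The "execute by default, opt out with --no-execute" suggestion maps directly onto `argparse`. This is a sketch only: the flag name comes from the note above, while `build_parser` and the `execute` attribute are hypothetical and do not reflect the actual VisTrails command-line code.

```python
import argparse


def build_parser():
    """Build a parser where execution is on unless --no-execute is given."""
    parser = argparse.ArgumentParser(description="batch-mode sketch")
    parser.add_argument(
        "--no-execute",
        dest="execute",
        action="store_false",   # passing the flag flips execute to False
        help="skip execution, e.g. to only capture graphs",
    )
    parser.set_defaults(execute=True)
    return parser
```

With this layout, `build_parser().parse_args([]).execute` is true and passing `--no-execute` turns it off.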
May 27, 2015
Updates
Items to Discuss
- [vistrails-users] API question
- Status of DAT
- Can add VCS plot to DAT but need to work on configuration windows
- Documentation for graphics templates for UV-CDAT/vcs?
- Qt support?
May 20, 2015
Updates
Items to Discuss
- [RR] Reviving the DAT, integrating scripting & porting UV-CDAT!
- On GitHub (issues)
- Merging 2 years of development taking longer than expected, but getting there. The plan is to get the patches in VisTrails and never fork again, we never want to get in UV-CDAT's situation (and don't need to).
- VTK cell works fine on Linux but there was flickering on Mac & Windows before; still issues on Mac (Windows status unknown)
- Can get a VCS plot soon (but will need VTK cell fix)
- Integrate in UV-CDAT's build system (so we have cdms, VCS, ...) -> RR can do this, low priority
- How do we integrate scripting?
- We want to be able to seamlessly make changes to a plot by changing Python code
- Define new plots by entering Python code without writing modules/packages?
- [RR] UV-CDAT needs #1073, please take a look
- [RR] Subworkflow issues
- [RR] #1074: can't load and edit a single pipeline if an upgrade happens
May 13, 2015
Updates
- [RR] Test skipping whitelist (#1069) -- low priority
- [RR] Custom matplotlib modules can't be compatible with both 2.1 and 2.2 (#1067); should be fixed for ALPS (#1070)
Items to Discuss
- [RR] UV-CDAT needs #1073, please take a look
- [RR] Subworkflow issues
- [RR] #1074: can't load and edit a single pipeline if an upgrade happens
May 7, 2015
Updates
- [RR] Export/import workflow to Python working!
Items to Discuss
- [Claudio] UV-CDAT
- The UV-CDAT project is the biggest user base of VisTrails
- VisTrails package management provides a lot of friction towards people plugging in their code
- Need to make it easy to integrate your random Python scripts in the system without having to deal with all the boilerplate, at least in the first step
- [RR] argues that modules are still good; UV-CDAT shouldn't move towards a purely script-based backend
- [RR] export/import with Python could reduce a lot of that friction by allowing 1) to edit workflow as Python 2) to open up boxes automatically if needed code doesn't match actual modules
- ...
April 29, 2015
Updates
Items to Discuss
- [RR] Abstractions/subworkflows status (tickets)
- [RR] matplotlib compatibility (2.1 & 2.2), #1067
- RR to try and fix ALPS matplotlib modules
April 22, 2015
Updates
- 2.2.0 has been released!
- Windows issue (via email from Ryan)
- issue with manifest file (may be a new file in VTK6?)
- Tommy has regenerated new Windows builds
- Binaries, pypi, and conda released
- [RR] Export as script
- Python sources using VTK need to switch to SetInputData (users should be aware of this)
April 15, 2015
Updates
- Ready for 2.2.0 (apart from binary/deps issues)
- Missing some libs (scikit-learn, tej, tdparser, SQLAlchemy+connectors)
- Windows: runvistrails.py is no longer used, so the PATH is wrong
- Windows: pip is broken, but it probably wouldn't work anyway because of permission issues (disable this?)
- Queries, upgrades and getPipeline() usage (#1054)
- Getting a pipeline with getPipeline() is not safe: it might return an invalid pipeline
- This is used in many places throughout the code, like queries
- Upgrading would require going through the controller, but that creates new actions
- [TC] Avoid copying a module's output if it's used as input by exactly one downstream module (#1060) (useful for big numpy arrays you can update in-place)
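The copy-avoidance idea in #1060 can be sketched as follows, assuming each module has a single output and connections are given as (upstream, downstream) pairs. `count_consumers` and `deliver` are hypothetical names; the actual change may be structured quite differently.

```python
import copy
from collections import Counter


def count_consumers(connections):
    """Count how many downstream modules read each upstream module's output."""
    return Counter(upstream for upstream, _downstream in connections)


def deliver(value, n_consumers):
    """Return one object per consumer, copying only when the value is shared."""
    if n_consumers == 1:
        # Sole consumer: hand over the original, which it may update in place.
        return [value]
    # Several consumers: each gets its own copy so in-place edits stay private.
    return [copy.deepcopy(value) for _ in range(n_consumers)]
```

For a large array with a single consumer this skips the copy entirely, which is where the saving comes from.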
April 8, 2015
Updates
- [DK] Merged RR's changes for output modules (1012 and 1013)
- RR will merge remaining changes, then create v2.2 branch!
Items to Discuss
- BNL need numpy array to VTK image
- Looks like VTK has some helpers for this
- We will help if issues arise
- Will contribute back to VisTrails package
- Upgrade issue: #1017
- Automatic upgrades should happen between versions of provided upgrades
- Our existing upgrades work around this so it doesn't need to be 2.2.0
- Corner-case VTK modules
- No longer need VTKCell input port, so don't interfere with registry and API anymore
- Still work weirdly, people probably shouldn't use them
- But we have lots of clunky modules since we wrap the whole of VTK; some people might rely on this and know how to use them, let's keep them anyway
- Ready for 2.2.0
April 1, 2015
Updates
- [RR] UV-CDAT: bugfixing for 2.2, long-term plans: implement scripting import/export in VisTrails, port to UV-CDAT
- Possibly, try to move to regular VisTrails to use new features
Items to Discuss
- 2.2 release: nothing much is pending anymore, release next week?
- add note to documentation about order of parameters in VTK
- add issue about exporter upgrades if not already there
March 25, 2015
Updates
Items to Discuss
- 2.2 release
- Checklist on Github: https://github.com/VisTrails/VisTrails/wiki/2.2-Release-Checklist
- Review output modules
- RR has a couple more issues to fix
- Ready to go -- sign app package for OSX? #984 We need access to the Apple Membership team
March 18, 2015
Updates
- [TE] VTK6 works
Items to Discuss
- [RR] Release v2.1.5 with backported tabledata?
- MTA example needs updated tabledata (for JoinTables)
- Google Maps package still not available
- [RR] Work toward v2.2.0?
- changes:
- new persistence
- API changes
- output module changes (upgrades?), maintain cells but try to upgrade
- not wrapping stuff
- VTK6? yes
- JobSubmission stuff?
- relabeling for upgrades #949
- makes sense, needs the tree view code to be updated, check selection
- See 2.2 checklist
- Discussion of #1016
- plumbing between outputs and output modes, how to define a mode that works for many outputs without writing for each output?
March 11, 2015
Updates
- [RR] persistent_archive done; merge? (#755)
- note about the focus events for widgets
- TE be aware of file_archive for future binaries that include persistent_archive
- [TE] New VTK package finished
Items to Discuss
- [RR] Unmark UV-CDAT/VisTrails as a fork of VisTrails/VisTrails? (see last week; decision needed)
- RR email to JF about this
- yes; email sent to Github
- [RR] Switching order of output ports (#1006)
- added port specs are sorted in a different place (the Module.*_port_specs properties) from those in the registry (which are sorted in the registry), and the two lists are then simply combined without respect to sort keys
- need to determine whether the two lists should be merged or remain distinct
- should make sure that order of input ports and output ports makes parallel connections for things with same order
- DK suggests breaking backward compatibility here: workflows still run, can fix easily if a problem in an existing package.
- [RR] Question about output modes (#1007), how to integrate in API (#24)
- Should ImageFileMode be removed? ("image" is not a mode, "file" is)
- ImageOutput missing?
- Feel free to change how formats work
- [TE] Test suite segfaults on Fedora 17 virtualbox. Install newer version? (Support ended in 2013)
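For the port-ordering question in #1006, the difference between concatenating the two lists and merging them by sort key can be shown with a small sketch. Port specs are modeled here as `(sort_key, name)` tuples, which is a hypothetical stand-in for the real port spec objects.

```python
def merge_port_specs(registry_specs, added_specs):
    """Merge two lists of (sort_key, name) pairs into one ordered list.

    sorted() is stable, so among ports with equal sort keys the registry
    ports stay ahead of the added ones.
    """
    return sorted(registry_specs + added_specs, key=lambda spec: spec[0])
```

Plain concatenation would leave every added port after every registry port regardless of its key; the merged version interleaves them.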
March 4, 2015
Updates
- [TE] VTK wrapper
- Works on VTK 5.10
- Still need to test VTK 6
- New general wrappers for python functions and classes
Items to Discuss
- [RR] Unmark UV-CDAT/VisTrails as a fork of VisTrails/VisTrails?
- When filing a pull request from UV-CDAT/VisTrails, VisTrails/VisTrails is selected by default
- People keep forgetting to change the default (#956 #968 #999 #1000 #1003 #1004 #1005)
- Only way to change that is to not have it marked as a fork of VisTrails/VisTrails
- Github staff can make that change for us; should we do it?
- Juliana: comments on this? visibility vs. convenience/annoyance for developers
- [RR] What about #991?
February 25, 2015
Updates
Items to Discuss
- [TE] vtkviewcell for infovis support, can we unify with VTKCell?
- need to test this
- [TE] vtk wrapping
- Mostly finished
- VTK 5.10 produces incorrect results with old wrapping
- Old wrapper is based mostly on VTK 4
- Most vtk_examples affected
- [1]
- should be able to upgrade from SetInput to SetInputData (need to drop GetOutput and replace with self ports)
- can we change vtkInstance to just return self and not wrap things
- terminator example not working under 5.8?
- How does VTK wrapping fit into general wrapping framework?
- [RR] new persistence package
February 18, 2015
Updates
- [RR] New VisTrails API and IPython integration (#24)
Items to Discuss
- [TE] VTK wrapping
- Benchmarking vtk package
- Old: 24.7 seconds
- New: 10.5 seconds (Except first time that adds 8 sec)
- The parsing that calls is_abstract (which tries to instantiate all vtk classes) is now only run the first time.
- get_items_from_sigstring takes 2 seconds, maybe we can use a lookup dict for already computed sigstrings?
- Now using a general python function wrapper
- VTK classes are wrapped into Python functions that do not depend on vistrails
- VTK functions can be executed without vistrails
- The spec maps functions into vistrails modules, but can also describe wrapping
- A general python function wrapper that supports
- kwarg inputs
- single, list, dict outputs
- callback for progress reporting
- temporary file generator for using FilePool
- optional output generation
- Creating specs:
- Create spec by hand
- Auto-create spec outline (TODO) and manually finish it
- Dynamically create spec (VTK)
- Implement documentation wrappers (Can use scikitlearn wrapper to wrap numpydoc) (TODO)
- Classes used as (bad) functions need to be wrapped in new functions before they are wrapped. This is different for each package.
- Classes are hard, as in VTK and matplotlib. Scikit-learn still does not wrap classes
- Spec diffing and patching could be done using code from matplotlib.
- Still needs upgrades from old VTK package
- Is it possible to dynamically wrap functions, e.g., you see a SetFunc and just remove the 'Set' prefix? Or do you need to create a complete mapping?
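The "lookup dict for already computed sigstrings" suggestion is plain memoization. In the sketch below, `parse_sigstring` is a cheap stand-in for the expensive call (it is not the real get_items_from_sigstring), and `stats` exists only to show how often the slow path runs.

```python
_sigstring_cache = {}
stats = {"calls": 0}


def parse_sigstring(sigstring):
    """Stand-in for the expensive lookup; not the real VisTrails function."""
    stats["calls"] += 1
    return tuple(part for part in sigstring.strip("()").split(",") if part)


def cached_parse_sigstring(sigstring):
    """Return the parsed items, computing them only once per sigstring."""
    try:
        return _sigstring_cache[sigstring]
    except KeyError:
        result = _sigstring_cache[sigstring] = parse_sigstring(sigstring)
        return result
```

This only pays off if the result for a given sigstring never changes while the cache is alive; otherwise the cache would need to be cleared when that happens.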
February 11, 2015
Updates
- Update from Friday's meeting
- discussed VisTrails internals
- discussed wrapping
- xml discussion, hard to modify because tied to db code
- TE has made it possible to add the schema-defined attributes to the intermediate representation
- higher-level operations on the port specs
- make sure the simple case works
- [JF] take a simple package with documentation and figure out what the base case for wrapping is
Items to Discuss
- [TE] VTK wrapping
- Dynamic loading works
- Reading XML is fast enough, but unserializing data is slow
- Working on patterns for patching
- Matplotlib has many advanced patterns like argument ordering, nested arguments, alternateSpecs, output types.
- Having all this in a general wrapper might confuse users?
- [RR] Delay module (except for identifiers) until you need it---e.g. don't deal with port specs, etc. until necessary
- Scripting Support #950
- [RR] Issue with getting code from modules
- Design a simple solution
- [JF] Couldn't you use modules as black boxes without conversion, just to call into modules/subworkflows easily from e.g. IPython?
- [RR] This is a job for the API, and a very separate use case. See #24
February 4, 2015
Updates
Items to Discuss
- Wrapping
- Format to use? Currently XML (like current matplotlib)
- JSON and YAML have simple "to python dictionary" methods
- But don't stream
- YAML a lot easier for humans
- [DK] vtk-new-package also changes parameter names, creates enumerations
- intermediate schema needs to be extensible
- packages will want to store their specific info for compute() method generation
- also might have specs-altering info, like matplotlib's alternateSpec
- representation to code; the registry already has a schema for some aspects
- [RR] We might want to see if Module subclasses can be created lazily
- no need to create all the classes just to register them in the registry and never actually use most of them
- future effort
- [RR] Where should VisTrails packages live?
- tej installs as 'vistrailspkg.tej', TE installed it as 'userpackages.tej'
- Currently, standard packages are 'vistrails.packages.', user packages are 'userpackages.' and packages loaded through pkg_resources might be anything
- [RR] Use 'vistrailspkg.' everywhere?
- Long-term effort to simplify package distribution/installation (and have VisTrails get them automatically?)
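The lazy-creation idea for Module subclasses raised above can be sketched with `type()`: store the description of each class at registration time and build the class only when it is first requested. `LazyRegistry` and its methods are hypothetical and are not the VisTrails registry API.

```python
class LazyRegistry(object):
    """Register class descriptions now, build the classes on first use."""

    def __init__(self):
        self._specs = {}      # name -> (base class, attribute dict)
        self._classes = {}    # name -> class, filled in lazily

    def register(self, name, base, attrs):
        """Record how to build the class without building it."""
        self._specs[name] = (base, attrs)

    def get(self, name):
        """Return the class for `name`, creating it on the first request."""
        if name not in self._classes:
            base, attrs = self._specs[name]
            # type(name, bases, dict) creates the class at this point only
            self._classes[name] = type(name, (base,), dict(attrs))
        return self._classes[name]

    def created(self):
        """Names of the classes that have actually been built so far."""
        return sorted(self._classes)
```

Registering thousands of wrapped classes then costs one dict entry each, and only the classes a workflow actually uses are ever built.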
January 28, 2015
Updates
- T. Caswell to come visit on Fri 6 to discuss wrapping work
Items to Discuss
- [TE] New VTK wrapping
- Current code by DK seems a good deal faster
- Generates XML that can be patched/tweaked, generates Python code from it, VisTrails only loads generated Python code
- RR would rather have VisTrails load intermediate representation (= XML) directly, wants to make sure this is not slower
- The goal is to turn the intermediate step into something generic that would be used for every wrapped package (vtk, numpy, matplotlib, sklearn, java) instead of each having its own
- TC has his own code at github:VTTools which parses numpy docstrings and generates modules, doesn't yet handle classes or persist anything
- Web crawler
- Right now, TE starts jobs for "start crawler", "stop crawler", "install classifier"
- RR would rather have the crawling be a job as far as VisTrails and tej are concerned
- The whole thing would be one pipeline: load examples, train classifier, start crawler [check for job, kill previous one, upload model, start processes], get snapshot, visualize
- Need some support in tej and job submission system: long-running jobs, stop a job (wait for it to finish?), restart a job even though results are cached
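To make the docstring-parsing approach from the wrapping discussion above concrete, here is a minimal sketch that extracts parameter names and types from a numpy-style docstring using only the standard library. It is not the VTTools code, it handles only the simple "name : type" form, and `parse_numpydoc_parameters` and `scale` are hypothetical names.

```python
import inspect
import re


def parse_numpydoc_parameters(func):
    """Return [(name, type)] from the 'Parameters' section of a docstring."""
    doc = inspect.getdoc(func) or ""
    lines = doc.splitlines()
    params = []
    in_section = False
    for i, line in enumerate(lines):
        # A section header is a title line followed by a line of dashes.
        underline = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if underline and set(underline) == {"-"}:
            in_section = line.strip() == "Parameters"
            continue
        # Entries are unindented "name : type" lines; descriptions are indented.
        match = re.match(r"^(\w+)\s*:\s*(.+)$", line)
        if in_section and match:
            params.append((match.group(1), match.group(2).strip()))
    return params


def scale(values, factor=1.0):
    """Scale values.

    Parameters
    ----------
    values : list of float
        Input numbers.
    factor : float, optional
        Multiplier.

    Returns
    -------
    list of float
    """
    return [v * factor for v in values]
```

Calling `parse_numpydoc_parameters(scale)` yields one (name, type) pair per documented parameter, which is the raw material a module generator would turn into input ports.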
January 21, 2015
Updates
Items to Discuss
- [RR] Unified wrapping method discussion #991
- TE to work on reusable method with intermediate representation, starting with VTK
- [RR] Examples for scikit-learn: JF has an old example using Weka with parameter exploration (not currently in source tree)
- AM's examples are enough
- [AM] scikit-learn package is done, merge it in? #955
- RR will merge
- [RR] What should copyright headers say? #994
- Let's keep everything in there: Utah/Poly/NYU
January 14, 2015
Updates
- [TE] Working on classifier
- [RR] Scripting integration, work in progress
Items to Discuss
- [RR] Unified wrapping method discussion (#991)
- Let's talk next week, [AM] and [DK] are not here
January 7, 2015
Updates
Items to Discuss
- make sure that we address critical issues, questions, and pending review branches in a timely manner
- scripting support
- [RR] no issues if we want to just keep annotations in the generated code to allow the link back to a workflow
- [RR] can translate from workflow to script, working on script to workflow
- will work for parameter value changes, structural changes require changes to the annotations
- need to publish best practices here
- would be cool to do looping in scripts (easier interface than with workflows)
- notebook support (convert from notebook to workflow)
- RR will sync with FC on this
- Issue with console in built-from-scratch
- [TC] IPython rearranged some of the completion stuff in 2.2 and 2.3
- binary has old version of IPython -> 1.0.0, should we update?
- [TC] automated wrapping of numpy and scipy
- discovered a bunch of malformed documentation in numpy and scipy
- has github repo for vistrails tools
- example modules wrap a bunch of R stuff (not baked in, just how things are)
- will be pushing wrapping logic up
- port names forbidden (window and domain)
- have an import hook to get from yaml directly to VisTrails Modules
- should work for any python modules with well-formed numpy docstrings.
- [Action] should make it clear in documentation that Constant now means serializable not that the value doesn't change (e.g. List)
- [TC] might be interesting to try to build components of matplotlib and accumulate in figure (long-term project, but thinking about how this might work)
- [TE] build and build scripts
- completely automatic, buildbot
- need to set the build machines for the environment we want for the binary
- would virtualenv work here?
- [TC] anaconda can pin versions, potential path to test different configurations
- Q: upload nightly binary builds? A: makes sense, make sure they are well-labeled
- sourceforge stats: e.g. http://sourceforge.net/projects/vistrails/files/vistrails/nightly/vistrails-src-nightly.tar.gz/stats/timeline?dates=2014-01-01+to+2015-01-07
- package issues (see Remi's message)
- [TE] Scope of tej
- Support single ssh commands?
- Queue can be used as a remote machine (crawler is using queue.call*)