Development
See also the GitHub wiki
2015
February 11, 2015
Updates
Items to Discuss
- [TE] VTK wrapping
- Dynamic loading works
- Reading XML is fast enough, but serializing data is slow
- Working on patterns for patching
- Matplotlib has many advanced patterns like argument ordering, nested arguments, alternateSpecs, output types.
- Having all this in a general wrapper might confuse users?
February 4, 2015
Updates
Items to Discuss
- Wrapping
- Format to use? Currently XML (like current matplotlib)
- JSON and YAML have simple "to python dictionary" methods
- But don't stream
- YAML a lot easier for humans
- [DK] vtk-new-package also changes parameter names, creates enumerations
- intermediate schema needs to be extensible
- packages will want to store their specific info for compute() method generation
- also might have specs-altering info, like matplotlib's alternateSpec
- representation to code, registry already has a schema for some aspects
- [RR] We might want to see if Module subclasses can be created lazily
- no need to create all the classes just to register them in the registry and never actually use most of them
- future effort
- [RR] Where should VisTrails packages live?
- tej installs as 'vistrailspkg.tej', TE installed it as 'userpackages.tej'
- Currently, standard packages are 'vistrails.packages.', user packages are 'userpackages.' and packages loaded through pkg_resources might be anything
- [RR] Use 'vistrailspkg.' everywhere?
- Long-term effort to simplify package distribution/installation (and have VisTrails get them automatically?)
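A minimal sketch of the lazy-creation idea raised by [RR] above: the registry keeps only lightweight specs and builds a Module subclass the first time it is requested. `Module`, `LazyRegistry`, and the spec dictionaries are hypothetical stand-ins for illustration, not the VisTrails registry API.

```python
class Module(object):
    """Hypothetical stand-in for a wrapped-module base class."""

    def compute(self):
        raise NotImplementedError


class LazyRegistry(object):
    """Stores specs at registration time; creates classes on first use."""

    def __init__(self):
        self._specs = {}    # name -> spec dict (cheap to keep around)
        self._classes = {}  # name -> class, filled in lazily

    def register(self, name, spec):
        # Registration only records the spec; no class is built here.
        self._specs[name] = spec

    def get_class(self, name):
        # Build the subclass the first time it is asked for, then cache it.
        if name not in self._classes:
            spec = self._specs[name]
            attrs = {'__doc__': spec.get('doc'), '_spec': spec}
            self._classes[name] = type(name, (Module,), attrs)
        return self._classes[name]

    def created(self):
        # Names of the classes that have actually been built so far.
        return sorted(self._classes)


registry = LazyRegistry()
for name in ('SphereSource', 'ConeSource', 'CubeSource'):
    registry.register(name, {'doc': 'Wrapped %s' % name})

# Only the class that is actually requested gets created.
sphere_cls = registry.get_class('SphereSource')
```

With this shape, registering a large package costs one dictionary entry per module. Whether it fits the real registry (which may need class objects up front) is an open question for the future effort noted above.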
January 28, 2015
Updates
- T. Caswell to come visit on Fri 6 to discuss wrapping work
Items to Discuss
- [TE] New VTK wrapping
- Current code by DK seems a good deal faster
- Generates XML that can be patched/tweaked, generates Python code from it, VisTrails only loads generated Python code
- RR would rather have VisTrails load intermediate representation (= XML) directly, wants to make sure this is not slower
- The goal is to turn the intermediate step into something generic that would be used for every wrapped package (vtk, numpy, matplotlib, sklearn, java) instead of each having its own
- TC has their own code at github:VTTools which parses numpy docstrings and generates modules, doesn't yet handle classes or persist anything
- Web crawler
- Right now, TE starts jobs for "start crawler", "stop crawler", "install classifier"
- RR would rather have the crawling be a job as far as VisTrails and tej are concerned
- The whole thing would be one pipeline: load examples, train classifier, start crawler [check for job, kill previous one, upload model, start processes], get snapshot, visualize
- Need some support in tej and job submission system: long-running jobs, stop a job (wait for it to finish?), restart a job even though results are cached
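A rough sketch of the two-step flow from the new VTK wrapping item above (an XML intermediate representation, then generated Python code). The element names (`moduleSpec`, `inputPortSpec`, `outputPortSpec`) and the shape of the generated class are assumptions made for illustration, not the schema used by DK's code.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate representation; it could be patched by hand
# before code generation.
SPEC_XML = """
<specs>
  <moduleSpec name="SphereSource">
    <inputPortSpec name="Radius" type="Float"/>
    <outputPortSpec name="Output" type="PolyData"/>
  </moduleSpec>
</specs>
"""


def read_ports(module_elem, tag):
    # Collect (name, type) pairs for one kind of port.
    return [(p.get('name'), p.get('type')) for p in module_elem.findall(tag)]


def generate_source(xml_text):
    """Turn the XML specs into Python source text, one class per moduleSpec."""
    root = ET.fromstring(xml_text)
    lines = []
    for mod in root.findall('moduleSpec'):
        lines.append('class %s(object):' % mod.get('name'))
        lines.append('    _input_ports = %r' % (read_ports(mod, 'inputPortSpec'),))
        lines.append('    _output_ports = %r' % (read_ports(mod, 'outputPortSpec'),))
        lines.append('')
    return '\n'.join(lines)


source = generate_source(SPEC_XML)

# Loading the generated code; in the approach RR prefers, classes would be
# built from the parsed tree directly instead of going through source text.
namespace = {}
exec(source, namespace)
```

Timing `generate_source` plus `exec` against building the classes straight from the parsed tree would be one way to check the "is this slower" concern for a given package.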
January 21, 2015
Updates
Items to Discuss
- [RR] Unified wrapping method discussion #991
- TE to work on reusable method with intermediate representation, starting with VTK
- [RR] Examples for scikit-learn: JF has an old example using Weka with parameter exploration (not currently in source tree)
- AM's examples are enough
- [AM] scikit-learn package is done, merge it in? #955
- RR will merge
- [RR] What should copyright headers say? #994
- Let's keep everything in there: Utah/Poly/NYU
January 14, 2015
Updates
- [TE] Working on classifier
- [RR] Scripting integration, work in progress
Items to Discuss
- [RR] Unified wrapping method discussion (#991)
- Let's talk next week, [AM] and [DK] are not here
January 7, 2015
Updates
Items to Discuss
- make sure that we address critical issues, questions, and pending review branches in a timely manner
- scripting support
- [RR] no issues if we want to just keep annotations in the generated code to allow the link back to a workflow
- [RR] can translate from workflow to script, working on script to workflow
- will work for parameter value changes, structural changes require changes to the annotations
- need to publish best practices here
- would be cool to do looping in scripts (easier interface than with workflows)
- notebook support (convert from notebook to workflow)
- RR will sync with FC on this
- Issue with console in built-from-scratch
- [TC] IPython rearranged some of the completion stuff in 2.2 and 2.3
- binary has old version of IPython -> 1.0.0, should we update?
- [TC] automated wrapping of numpy and scipy
- discovered a bunch of malformed documentation in numpy and scipy
- has github repo for vistrails tools
- example modules wrap a bunch of R stuff (not baked in, just how things are)
- will be pushing wrapping logic up
- port names forbidden (window and domain)
- have an import hook to get from yaml directly to VisTrails Modules
- should work for any python modules with well-formed numpy docstrings.
- [Action] should make it clear in documentation that Constant now means serializable, not that the value doesn't change (e.g. List)
- [TC] might be interesting to try to build components of matplotlib and accumulate in figure (long-term project, but thinking about how this might work)
- [TE] build and build scripts
- completely automatic, buildbot
- need to set up the build machines with the environment we want for the binary
- would virtualenv work here?
- [TC] anaconda can pin versions, potential path to test different configurations
- Q: upload nightly binary builds? A: makes sense, make sure they are well-labeled
- SourceForge stats: e.g. http://sourceforge.net/projects/vistrails/files/vistrails/nightly/vistrails-src-nightly.tar.gz/stats/timeline?dates=2014-01-01+to+2015-01-07
- package issues (see Remi's message)
- [TE] Scope of tej
- Support single ssh commands?
- Queue can be used as a remote machine (crawler is using queue.call*)
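To illustrate the docstring-driven wrapping from the [TC] item above, here is a minimal sketch that pulls parameter names and types out of a numpy-style docstring using only the standard library. `parse_parameters` and the `smooth` example function are hypothetical and are not taken from TC's repository. A real wrapper would also need to handle malformed docstrings and forbidden port names.

```python
import inspect
import re

# A parameter line looks like "name : type" and is not indented
# (description lines below it are indented, so they do not match).
PARAM_RE = re.compile(r'^(\w+)\s*:\s*(.+)$')


def parse_parameters(docstring):
    """Return [(name, type)] from the Parameters section of a docstring."""
    lines = inspect.cleandoc(docstring).splitlines()
    params = []
    in_params = False
    for i, line in enumerate(lines):
        following = lines[i + 1] if i + 1 < len(lines) else ''
        # A section header is a title followed by a line made only of dashes.
        if line and set(following) == {'-'}:
            in_params = (line == 'Parameters')
            continue
        if not in_params:
            continue
        match = PARAM_RE.match(line)
        if match:
            params.append((match.group(1), match.group(2).strip()))
    return params


def smooth(data, width, order=2):
    """Smooth a signal.

    Parameters
    ----------
    data : array_like
        Input samples.
    width : int
        Length of the smoothing window.
    order : int, optional
        Polynomial order.

    Returns
    -------
    out : ndarray
        Smoothed samples.
    """
    return data


ports = parse_parameters(smooth.__doc__)
```

Each `(name, type)` pair is a candidate input port, and the `Returns` section could be handled the same way for output ports.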