Development/2015

From VistrailsWiki

December 30, 2015

Updates

  • Bokeh package finished
    • Can output to Browser/IPython/Spreadsheet (Limited functionality in spreadsheet)

Items to Discuss

  • Updating VisTrails dependencies
    • Bokeh needs QWebEngine (Chromium) in Qt5 which requires PyQt5
      • Many plotting libraries only work in browsers
      • But Chromium may be a big and controversial dependency
      • Or create a browser version of the spreadsheet?
    • Some users have requested Python 3
    • Not backwards compatible?
    • Target for VisTrails 3.0 with new interpreter?

December 16, 2015

Updates

Items to Discuss

  • Users list question
  • [TE] Wrapping Bokeh
    • Added bokeh property parser
      • complements numpydoc parser
      • Not all attributes are available, partially because I am using a pre-release version.
    • Wrapped modules
      • bokeh.plotting finished (Backwards like matplotlib: Create plot then draw on it)
      • bokeh.charts mostly done (Missing arguments)
    • QWebView mostly works (but no zooming or buttons)
      • It would be better to use external browser or QWebEngineView
    • Added improvements to wrapper
      • Generalized port translations (Color/Path to native types)
      • Can use multiple docstring/property parsers simultaneously

December 9, 2015

Updates

Items to Discuss

  • [RR] Looking into wrapper
    • Will port TensorFlow
    • Doing a pass on the wrapper code & doc
      • UX improvements
    • Factorize high-level docstring parsers into core.wrapper? (sphinx, numpy, google)
    • Problem with name vs module_name (fails sklearn tests because class_by_name() is broken)
    • Looking into compressing specs, lazy registration of modules

December 2, 2015

Updates

  • [TE] Package wrapping
    • Wrapped most of numpy/scipy using __all__
    • Added class attribute and method access for classes (Can add to class or to separate inspector module)
    • TODO: Document wrapping procedure

Items to Discuss

November 25, 2015

Updates

  • [RR] New interpreter is coming along
    • Everything is a stream
    • Task system, eventually work-stealing parallelism
    • Streams no longer advance in lock-step
      • Allows for constructions like filter, join, sorted-merge
    • depth>1 still to be tested, no current plan to allow this through the module interface
      • Through looping though, can do streams of streams
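As an illustration of why dropping lock-step matters, here is a minimal sketch using plain Python generators. It is not the new interpreter's API, which these notes do not describe; `sorted_merge` and `keep_even` are hypothetical names. A sorted-merge has to pull from whichever input currently holds the smaller item, so its inputs advance at different rates, and a filter emits fewer items than it consumes.

```python
import heapq


def sorted_merge(left, right):
    """Merge two already-sorted streams into one sorted stream.

    Each input is consumed only as far as needed, so the two inputs
    do not advance in lock-step.
    """
    return heapq.merge(left, right)


def keep_even(stream):
    """Filter a stream: the output has fewer items than the input."""
    return (item for item in stream if item % 2 == 0)


merged = list(sorted_merge(iter([1, 4, 7]), iter([2, 3, 9])))
evens = list(keep_even(iter(merged)))
```

Here `merged` is `[1, 2, 3, 4, 7, 9]` and `evens` is `[2, 4]`.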

Items to Discuss

  • [TE] Package wrapping
    • Wrapped most numpy functions (no polynomials)
    • and half of scipy classes and functions
    • Added functions manually from documentation (no parseable lists?)
    • Focusing on data flow constructs (no property/method access for classes)
      • Using spec diff (empty for now) and dynamic parser (stores spec in .vistrails/)
    • Python class/function wrapper now stable
    • Add google doc parser?
      • Is there a parser available?

November 18, 2015

Updates

Items to Discuss

  • [TE] Package wrapping
    • Numpy class wrapping works
      • Basic type is List, since most types are array_likes.
      • TODO: More classes and functions
      • Some operations are in-place and some aren't, and docstring not super clear
      • [RR] It seems only methods mutate, and most have a numpy.xx function equivalent; just manually go through the methods and remove the mutating ones?
    • Class wrapper is modular
      • Docstring parser and type string parser can be customized
      • Classes can have optional attribute/method ports
        • Inspectors and attribute/method modules can be created separately
    • Use PythonCalc as an example
      • Can be re-implemented as a function with a parseable docstring
      • [RR] PythonCalc is just an example, and it might actually make sense to get rid of it, or do a proper math package with scalar operations (as separate modules, no combobox)
    • [RR] TensorFlow to use wrapping as well
      • Very simple (only types 'tensor' & 'variable')

November 11, 2015

Updates

Items to Discuss

  • [TE] Package wrapping
    • Wrapping numpy's classes using numpydoc (ndarray)
      • Wrapping constructor arguments, attribute getters/setters as module ports
        • Then how to access attributes afterwards?
          • Use input value port to class modules?
      • Wrap methods as modules? E.g., `ndarray.shape`.
      • Function wrapping is almost a subset of class wrapping
        • May be able to use the same parsing/execution methods
    • VTK's non-getter/setter methods could be wrapped as modules
      • We could then remove extra logic in interpreter for keeping function order
    • Bokeh uses autogenerated docstrings
      • We may be able to read the specification directly
  • [RR] TensorFlow package
    • Basic setup working, can execute the Mandelbrot example
    • Will autogenerate the operations

November 4, 2015

Updates

  • [RR] Interpreter work requires a fix for DB issue #1137
  • [DK] Kitware's Resonant [1]
    • Girder: data management system
    • Romanesco: execution engine, uses Celery for task management
    • Resonant Flow: web app for editing and executing workflows

Items to Discuss

  • [TE] Added OSX Lion (10.7) VM on build machine using vagrant (seems ok with license?)
    • New builds work on Lion
    • Change the minimum requirement to 10.7?
  • [TE] Package wrapping
  • Added upgrade suggestions using 2 spec versions (example)
    • using name edit distance to find matches
    • Could use something better like port similarity for modules, and type similarity for ports
    • Should write python upgrade code?
    • Added spec to sklearn package
      • Can now diff spec versions and keep the spec static
      • There will be problems with downgrades and package versions
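The name-matching idea above can be sketched roughly as follows. This is only an illustration, not the code used in the package: it uses the similarity ratio from the standard `difflib` module as a stand-in for edit distance, and the function name, cutoff and module names are made up.

```python
import difflib


def suggest_upgrades(old_names, new_names, cutoff=0.6):
    """Map each name missing from the new spec to its closest new name.

    Names with no sufficiently similar candidate map to None.
    """
    added = [n for n in new_names if n not in old_names]
    suggestions = {}
    for name in old_names:
        if name in new_names:
            continue  # unchanged, no upgrade needed
        matches = difflib.get_close_matches(name, added, n=1, cutoff=cutoff)
        suggestions[name] = matches[0] if matches else None
    return suggestions


# Hypothetical module names from two spec versions.
old_spec = ["MplPlot", "MplHist", "MplLegacyThing"]
new_spec = ["MplPlot", "MplHistogram", "MplViolinplot"]
suggestions = suggest_upgrades(old_spec, new_spec)
```

With these inputs `MplHist` is paired with `MplHistogram`, while `MplLegacyThing` gets no suggestion. Port or type similarity, as mentioned in the notes, would replace the string comparison.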

October 29, 2015

Updates

  • [TE] nightly-build-mac fixed (Needed to approve xcode license)

Items to Discuss

  • [TE] matplotlib wrapper
    • ported to general spec
    • Diff can be used on general spec
      • Only needed minor changes
      • Fixed indexing bug that corrupted specs
      • Can now move diff tools to core/wrapper
    • Add new plots?
      • New plots in mpl 1.3: eventplot
      • New plots in mpl 1.4: angle_spectrum, magnitude_spectrum, phase_spectrum, violinplot
      • No new plots in mpl 1.5 (from looking at boilerplate.py)

October 21, 2015

Updates

  • [RR] Alexis has arrived, will be working with [RR] on a more efficient interpreter

Items to Discuss

  • [TE] New matplotlib package
    • Supporting multiple package versions
      • Check which version can be loaded #1135
      • We cannot show version requirements in the list of packages because old packages are loaded when importing the codepath
    • Visual diff does not work well, but we can use the spec differ to see differences.
    • Unifying vtk and matplotlib wrapping specs (WIP)
      • And porting matplotlib's spec differ to work on the general spec
      • Spec differ can then be used on vtk and other packages in the future
      • Will enable automatic upgrade generation
  • [RR] Rework interpreter
    • restore abstraction between interpreter/module code
    • build looping, streaming, caching into the interpreter
    • rework cache
    • look into parallelism once basic functionality is there

October 14, 2015

Updates

Items to Discuss

October 7, 2015

Updates

Items to Discuss

  • [TE] New matplotlib parser (adding numpydoc parser)
    • Added numpydoc attribute parser for plots.
      • TODO: Need better port spec reconciliation with call signature parser.
      • We can create a general numpydoc parser, but attribute types are unique to matplotlib.
    • Needs package versioning
      • New spec will not support matplotlib < v1.4 due to changed path to axes classes.
      • Load spec version corresponding to installed matplotlib version?
      • Need version downgrades?
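One possible answer to the "load spec version corresponding to installed version" question is to pick the newest spec that is not newer than the installed library. The sketch below assumes versions are represented as tuples; `pick_spec_version` and the list of available specs are hypothetical.

```python
def pick_spec_version(installed, available):
    """Return the newest spec version that is not newer than `installed`.

    Versions are tuples such as (1, 4). Returns None if every available
    spec targets a newer library than the one installed.
    """
    candidates = [v for v in available if v <= installed]
    return max(candidates) if candidates else None


available_specs = [(1, 3), (1, 4), (1, 5)]  # made-up spec versions
chosen = pick_spec_version((1, 4, 3), available_specs)
too_old = pick_spec_version((1, 2), available_specs)
```

For an installed 1.4.3 this selects the `(1, 4)` spec; for 1.2 nothing matches, which is the case where a downgrade path or an error message would be needed.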

September 30, 2015

Updates

Items to Discuss

  • [TE] Updating Matplotlib parser
    • Matplotlib docstring parser fails on numpy docstrings
    • I have added basic numpydoc parsing
      • Only used by a few docstrings so far
      • Parsing uses many sources (class tables, signatures, ACCEPTS, method docstrings, definition parsing).
    • Caswell said they were thinking about moving to traitlets, but this is not ready yet

September 23, 2015

Updates

  • [TE] VisTrails 2.2.3 released
    • Also have pushed to PyPI, binstar, etc.
    • Sourceforge vs. GitHub
      • should be able to host releases on GitHub
      • nightly binaries pushed to sf each night
      • old binaries?
  • [TE] Re-wrapping MatPlotlib
    • Keep static generation - Docstrings are brittle between matplotlib versions
    • Keep generating the executable classes - but create functions, not vistrails Modules
      • Why is there so much patching?
    • Re-implemented Alternate PortSpec for InputPortSpec
      • It will now inherit specs from parent (No need to reimplement)
    • Module upgrades?

Items to Discuss

September 16, 2015

Updates

  • [TE] Job Monitor tests and documentation done
    • OK to leave document package in packages directory?

Items to Discuss

  • [TE] Library wrapping: How to do code patching? Wrapping diagram
    • Insert code into module template (Old VTK code)
    • Patch the library that is called (New VTK code)
    • Somehow store code in spec and apply when executed?
      • Executed code needs at a minimum access to inputs, outputs, current class
      • Use standard input/output dict and obj reference that the code operates on?
      • Can this be done while keeping the execution abstract?
  • Do new release now?
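The third patching option in the library wrapping discussion (code stored in the spec, operating on a standard input/output dict and an object reference) could look roughly like this. It is a sketch under assumptions: the names `apply_patch`, `obj`, `inputs` and `outputs` are hypothetical, the patch text is invented, and running stored text with `exec` is only reasonable for trusted specs.

```python
def apply_patch(patch_source, obj, inputs):
    """Run a patch snippet stored as text against `obj`.

    The snippet sees three names: `obj`, `inputs` (a dict) and `outputs`
    (a dict it may fill in). Nothing VisTrails-specific is required.
    """
    outputs = {}
    namespace = {"obj": obj, "inputs": inputs, "outputs": outputs}
    exec(patch_source, namespace)
    return outputs


class Reader(object):
    """Stand-in for a wrapped library class."""

    def __init__(self):
        self.filename = None


# Patch text as it might be stored in a spec entry (made up for this sketch).
patch = "obj.filename = inputs['file']\noutputs['name'] = obj.filename.upper()"
reader = Reader()
result = apply_patch(patch, reader, {"file": "data.vtk"})
```

Because the snippet only touches the three agreed names, the caller does not need to know what the patch does, which is one way of keeping the execution abstract.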

September 9, 2015

Updates

Items to Discuss

  • New release?
    • [RR] wants reprounzip
    • tej changes & doc
    • job monitor & jobmixin fixes
    • mongodb
    • reprounzip
    • warning: MplFigure type is int
    • tabledata: add Reshape, DictoToTable & ListToTable, fix ListToTable with numpy arrays
    • PythonSource can have same name for an input & output
    • don't show spreadsheet at exit

September 2, 2015

Updates

Items to Discuss

  • [TE] Limit autosaves? #1126

August 26, 2015

Updates

  • [TE] Added Job support to Parameter Explorations (Requested by Colin), and Mashups
    • specify job ids, need to specify different ids for parameter explorations since they have the same version id
    • how to deal with parameters passed in on command line
  • [General] provenance: should be creating a new version when we execute workflow with changed parameters?
    • currently, the custom_params annotation stores this in provenance
    • [RR] API doesn't record provenance if passing in parameters: http://git.io/vsAA5
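Since all runs of a parameter exploration share one version id, a distinct job id needs some extra ingredient. One hypothetical scheme, not necessarily what was implemented, is to append a stable digest of the parameter values; `job_id` is an invented helper.

```python
import hashlib
import json


def job_id(version_id, parameters):
    """Combine a version id with a stable digest of the parameter values."""
    payload = json.dumps(parameters, sort_keys=True).encode("utf-8")
    digest = hashlib.sha1(payload).hexdigest()[:8]
    return "{0}-{1}".format(version_id, digest)


first = job_id(42, {"threshold": 0.1})
second = job_id(42, {"threshold": 0.2})
```

Sorting the keys makes the id independent of the order in which parameters are listed, so re-running the same exploration cell finds the same job.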

Items to Discuss

August 19, 2015

Updates

Items to Discuss

  • [RR] JobMixin and JobMonitor: stable now?
    • Definitely needs more tests
  • [TE] Stop testing VisTrails 2.0?

August 12, 2015

Updates

Items to Discuss

  • [RR] Internal docs #1116
  • [RR] JobMixin and JobMonitor: stable now?
    • Definitely needs more tests
  • [RR] Improvements to tej, users' guide entry & example #1105
  • [RR] Build broken on Travis; because of IPython 4 released today? (build 992; #1123)

August 5, 2015

Updates

  • [TE] Fixes to Jobs
    • Could not delete jobs
    • Could not run job in group
    • Job not reset when calling ModuleSuspended
    • Added deleting job from context menu
  • [RR] Writing documentation for everything #1105

Items to Discuss

July 29, 2015

Updates

  • RR still looking into new interpreter thing
    • Goal is to take out scheduling logic from Module so it can be split in multiple processes, and so that smarter strategies can be added in time
    • This means some work on packages
    • Spreadsheet can live in kernel process? Still some UI stuff that will take work (changing configuration, persistent archive's viewer, ...)
  • [TE] Problems running examples #1111
    • Testing of more examples requires additional packages on the test machines.
    • Fixed faulty line endings in PythonSource modules that failed on Python 2.6.
    • Test suite now testing SUDSWebServices (If web service is down, test suite will fail)
    • preferences.py test failed reloading 'dialogs' package, switched to using 'URL'.

Items to Discuss

July 22, 2015

Updates

  • [TE] mailing lists back online
  • [TE] Working on #1107
  • [RR] Working on ReproZip package

Items to Discuss

  • [RR] MongoDB package #1106
  • [RR] Example for tej docs? #1105

July 15, 2015

Updates

  • 2.2.2 released

Items to Discuss

  • RR is considering executing everything in an IPython kernel (i.e. separate Python interpreter, like the one spawned for a notebook)
    • It's a separate process that we can restart without restarting the app/gui
    • We can isolate the execution environment (e.g. for the server)
    • We can run the whole thing remotely (if your data is elsewhere, just run VisTrails locally and the pipeline elsewhere)
    • Pipeline execution no longer makes the interface hang, it just makes the kernel hang (but that's fine)
    • We can use notebooks as modules (probably way nicer than the PythonSource module)
    • We can run multiple kernels so long as the ports carry things that are serializable
      • meaning we can have the multithreaded interpreter without all the hacky parts it has now
      • we can run IPython kernels in all the languages IPython supports, currently 46

July 8, 2015

Updates

Items to Discuss

  • [TE] Buildbot github hook not working after IP address change
  • [TE] No reference to VistrailsApplication (#1103)
  • [TE] Reopening VT file after saving with bundled subworkflow won't offer subworkflow upgrade (#1102)
    • allow manual delete to fix right now
    • fix this on top of the use-uuid branch
  • [TE] Release VisTrails 2.2.1? (CHANGELOG)

July 1, 2015

Updates

  • [TE] PROV fixed
  • [TE] Working on subworkflow issues
  • [RR] Considering reworking the controller (log vs exception problem, retained upgrades causing interferences) and also the interpreter (IPython?)

Items to Discuss

June 24, 2015

Updates

Items to Discuss

  • UV-CDAT
  • [TE] Can a cyclic workflow be valid? (#1097)
    • focus on disabling the ability to create cyclic pipelines because more things break than just this with a cyclic pipeline
  • [RR] Relative paths (#1057)
    • This interacts with the new bundle; how do we handle packing files inside the VT bundle?
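Refusing connections that would close a loop, as proposed for #1097, amounts to a reachability check before each new connection. A minimal sketch, assuming the pipeline is available as a mapping from module id to downstream module ids (the function name and data layout are hypothetical):

```python
def would_create_cycle(connections, source, target):
    """Return True if adding source -> target would close a cycle.

    `connections` maps each module id to the ids it feeds into.
    A cycle appears exactly when `source` is already reachable from `target`.
    """
    stack = [target]
    seen = set()
    while stack:
        node = stack.pop()
        if node == source:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(connections.get(node, ()))
    return False


pipeline = {1: [2], 2: [3], 3: []}
closes_loop = would_create_cycle(pipeline, 3, 1)
still_acyclic = would_create_cycle(pipeline, 1, 3)
```

Connecting module 3 back to module 1 is rejected, while adding a shortcut from 1 to 3 is allowed.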

June 17, 2015

Updates

Items to Discuss

  • [TE] current_version and reusing existing upgrades are broken (ticket #1095)
    • Could be that export to PROV is using an unflushed upgrade pipeline?
    • current_version would then be correct.
    • It may work to flush the actions before exporting?

June 10, 2015

Updates

  • DAT: fixed VTK issue on Linux and Mac
  • Still crashes on Windows. Need help! Reminder: this works in the VisTrails spreadsheet (QCellPresenter), although no widget gets changed there during a drag
    • Is it a VTK bug?
    • Is it simply impossible to change widgets during the drag, should we do it a different way?
    • Did I miss something that is done in VisTrails but somehow not in DAT?
    • Low prio, UV-CDAT doesn't run on Windows anyway

Items to Discuss

June 3, 2015

Updates

Items to Discuss

  • [TE] Executable Paper (ticket #1088 pr)
    • Requires fixes to command line parameters, Output modules, and batch mode
    • How to test this
    • Updated missing/outdated flags
    • Fixed view issues when generating graphs
  • [TE] batch mode
    • SpreadsheetOutput not enabled in batch mode. Should we check is_running_gui instead?
    • Other instance setting flags from caller
    • Is graphsAsPdf replacing spreadsheetDumpPdf?
    • graphsAsPdf true by Default?
    • Batch mode executing by default (Not needed when generating graphs)
    • Re-added workflowInfo as withWorkflowInfo for writing graph and xml workflow
    • [DK] batch mode should be outputting to files or stdout, shouldn't always trigger SpreadsheetMode
    • execute flag, maybe make execution the default and allow a "--no-execute" if you only want to capture graphs, for example
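The "execute by default, opt out with --no-execute" suggestion maps directly onto a `store_false` option in `argparse`. This is only a sketch of the flag's behavior, not VisTrails' actual command-line code; the program name is made up.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="batch-sketch")
    # store_false means args.execute defaults to True and the flag turns it off.
    parser.add_argument("--no-execute", dest="execute", action="store_false",
                        help="skip execution, e.g. to only capture graphs")
    return parser


default_args = build_parser().parse_args([])
graph_only_args = build_parser().parse_args(["--no-execute"])
```

Without the flag `execute` is `True`; with it, `execute` is `False`.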

May 27, 2015

Updates

Items to Discuss

May 20, 2015

Updates

Items to Discuss

  • [RR] Reviving the DAT, integrating scripting & porting UV-CDAT!
    • On GitHub (issues)
    • Merging 2 years of development taking longer than expected, but getting there. The plan is to get the patches in VisTrails and never fork again, we never want to get in UV-CDAT's situation (and don't need to).
    • VTK cell works fine on Linux but there was flickering on Mac & Windows before; still issues on Mac (Windows status unknown)
    • Can get a VCS plot soon (but will need VTK cell fix)
    • Integrate in UV-CDAT's build system (so we have cdms, VCS, ...) -> RR can do this, low priority
    • How do we integrate scripting?
      • We want to be able to seamlessly make changes to a plot by changing Python code
      • Define new plots by entering Python code without writing modules/packages?
  • [RR] UV-CDAT needs #1073, please take a look
  • [RR] Subworkflow issues
    • #1065: missing attribute (fixed)
    • #1066: groups in subworkflows were ignored when finding dependencies (fixed)
    • #725: "upgrade subworkflow" button not popping up (fixed); also, namespace issue
    • #1071: subworkflows don't go through upgrades
  • [RR] #1074: can't load and edit a single pipeline if an upgrade happens

May 13, 2015

Updates

  • [RR] Test skipping whitelist (#1069) -- low priority
  • [RR] Custom matplotlib modules can't be compatible with both 2.1 and 2.2 (#1067); should be fixed for ALPS (#1070)

Items to Discuss

  • [RR] UV-CDAT needs #1073, please take a look
  • [RR] Subworkflow issues
    • #1065: missing attribute (fixed)
    • #1066: groups in subworkflows were ignored when finding dependencies (fixed)
    • #725: "upgrade subworkflow" button not popping up (fixed); also, namespace issue
    • #1071: subworkflows don't go through upgrades
  • [RR] #1074: can't load and edit a single pipeline if an upgrade happens

May 7, 2015

Updates

  • [RR] Export/import workflow to Python working!

Items to Discuss

  • [Claudio] UV-CDAT
    • The UV-CDAT project is the biggest user base of VisTrails
    • VisTrails package management provides a lot of friction towards people plugging in their code
    • Need to make it easy to integrate your random Python scripts in the system without having to deal with all the boilerplate, at least in the first step
    • [RR] argues that modules are still good; UV-CDAT shouldn't move towards a purely script-based backend
    • [RR] export/import with Python could reduce a lot of that friction by allowing 1) to edit workflow as Python 2) to open up boxes automatically if needed code doesn't match actual modules
    • ...

April 29, 2015

Updates

Items to Discuss

  • [RR] Abstractions subworkflows status (tickets)
  • [RR] matplotlib compatibility (2.1 & 2.2), #1067
    • RR to try and fix ALPS matplotlib modules

April 22, 2015

Updates

  • 2.2.0 has been released!
  • Windows issue (via email from Ryan)
    • issue with manifest file (may be a new file in VTK6?)
    • Tommy has regenerated new Windows builds
  • Binaries, pypi, and conda released
  • [RR] Export as script
  • Python sources using VTK need to switch to SetInputData (users should be aware of this)

April 15, 2015

Updates

  • Ready for 2.2.0 (apart from binary/deps issues)
    • Missing some libs (scikit-learn, tej, tdparser, SQLAlchemy+connectors)
    • Windows: runvistrails.py is no longer used, so the PATH is wrong
    • Windows: pip is broken, but it probably wouldn't work anyway because of permission issues (disable this?)
  • Queries, upgrades and getPipeline() usage (#1054)
    • Getting a pipeline with getPipeline() is not safe: it might return an invalid pipeline
    • This is used in many places throughout the code, like queries
    • Upgrading would require going through the controller, but that creates new actions
  • [TC] Avoid copying a module's output if it's used as input by exactly one downstream module (#1060) (useful for big numpy arrays you can update in-place)
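The copy-avoidance idea in #1060 can be sketched as follows, assuming the interpreter knows how many downstream modules consume an output. `deliver_outputs` and the consumer names are hypothetical, and `copy.deepcopy` stands in for whatever copying is actually done.

```python
import copy


def deliver_outputs(output_value, consumers):
    """Hand `output_value` to each consumer, copying only when shared.

    With exactly one consumer the original object is passed through, so
    that consumer may update it in place; otherwise each gets a copy.
    """
    if len(consumers) == 1:
        return {consumers[0]: output_value}
    return {name: copy.deepcopy(output_value) for name in consumers}


data = [1, 2, 3]
single = deliver_outputs(data, ["smooth"])
shared = deliver_outputs(data, ["smooth", "plot"])
```

In the single-consumer case `single["smooth"]` is the very same object as `data`; in the shared case each consumer receives an equal but separate copy.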

April 8, 2015

Updates

  • [DK] Merged RR's changes for output modules (1012 and 1013)
    • RR will merge remaining changes, then create v2.2 branch!

Items to Discuss

  • BNL need numpy array to VTK image
    • Looks like VTK has some helpers for this
    • We will help if issues arise
    • Will contribute back to VisTrails package
  • Upgrade issue: #1017
    • Automatic upgrades should happen between versions of provided upgrades
    • Our existing upgrades work around this so it doesn't need to be 2.2.0
  • Corner-case VTK modules
    • No longer need VTKCell input port, so don't interfere with registry and API anymore
    • Still work weirdly, people probably shouldn't use them
    • But we have lots of clunky modules since we wrap the whole of VTK; some people might rely on this and know how to use them, let's keep them anyway
    • Ready for 2.2.0

April 1, 2015

Updates

  • [RR] UV-CDAT: bugfixing for 2.2, long-term plans: implement scripting import/export in VisTrails, port to UV-CDAT
  • Possibly, try to move to regular VisTrails to use new features

Items to Discuss

  • 2.2 release: nothing much is pending anymore, release next week?
    • [TE] vtkExporter classes #1032
    • [RR] Hiding upgrades in version tree might make it (or might be 2.2.1 so we can test it out) #949
    • Output modules changes to go in
  • add note to documentation about order of parameters in VTK
  • add issue about exporter upgrades if not already there

March 25, 2015

Updates

Items to Discuss

March 18, 2015

Updates

  • [TE] VTK6 works

Items to Discuss

  • [RR] Release v2.1.5 with backported tabledata?
    • MTA example needs updated tabledata (for JoinTables)
    • Google Maps package still not available
  • [RR] Work toward v2.2.0?
    • changes:
      • new persistence
      • API changes
      • output module changes (upgrades?), maintain cells but try to upgrade
      • not wrapping stuff
      • VTK6? yes
      • JobSubmission stuff?
      • relabeling for upgrades #949
    • makes sense, needs the tree view code to be updated, check selection
    • See 2.2 checklist
  • Discussion of #1016
    • plumbing between outputs and output modes, how to define a mode that works for many outputs without writing for each output?

March 11, 2015

Updates

  • [RR] persistent_archive done; merge? (#755)
    • note about the focus events for widgets
    • TE be aware of file_archive for future binaries that include persistent_archive
  • [TE] New VTK package finished

Items to Discuss

  • [RR] Unmark UV-CDAT/VisTrails as a fork of VisTrails/VisTrails? (see last week; decision needed)
    • RR email to JF about this
    • yes; email sent to Github
  • [RR] Switching order of output ports (#1006)
    • added port specs are sorted in a different place (Module.*_port_specs properties) from those in the registry (which are sorted in the registry), and the two lists are then just concatenated without respect to sort keys
    • need to determine whether the two lists should be merged or remain distinct
    • should make sure that order of input ports and output ports makes parallel connections for things with same order
    • DK suggests breaking backward compatibility here: workflows still run, can fix easily if a problem in an existing package.
  • [RR] Question about output modes (#1007), how to integrate in API (#24)
    • Should ImageFileMode be removed? ("image" is not a mode, "file" is)
    • ImageOutput missing?
    • Feel free to change how formats works
  • [TE] Test suite segfaults on Fedora 17 virtualbox. Install newer version? (Support ended in 2013)
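To make the port-ordering problem in #1006 concrete, the sketch below contrasts plain concatenation with a merge on sort keys. The port names and keys are invented, and merging is only one of the two options under discussion.

```python
from operator import itemgetter

# (name, sort_key) pairs, each list already sorted on its own.
registry_ports = [("value", 0), ("status", 2)]
added_ports = [("extra", 1), ("debug", 3)]

# Concatenation keeps each list's order but ignores sort keys across lists.
concatenated = [name for name, _ in registry_ports + added_ports]

# Merging on the sort key interleaves the two lists.
merged = [name for name, _ in sorted(registry_ports + added_ports,
                                      key=itemgetter(1))]
```

Concatenation yields `value, status, extra, debug`, whereas the merge yields `value, extra, status, debug`.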

March 4, 2015

Updates

  • [TE] VTK wrapper
    • Works on VTK 5.10
    • Still need to test VTK 6
    • New general wrappers for python functions and classes

Items to Discuss

  • [RR] Unmark UV-CDAT/VisTrails as a fork of VisTrails/VisTrails?
  • [RR] What about #991?

February 25, 2015

Updates

Items to Discuss

  • [TE] vtkviewcell for infovis support, can we unify with VTKCell?
    • need to test this
  • [TE] vtk wrapping
    • Mostly finished
    • VTK 5.10 produces incorrect results with old wrapping
      • Old wrapper is based mostly on VTK 4
      • Most vtk_examples affected
      • [2]
      • should be able to upgrade from SetInput to SetInputData (need to drop GetOutput and replace with self ports)
      • can we change vtkInstance to just return self and not wrap things
    • terminator example not working under 5.8?
  • How does VTK wrapping fit into general wrapping framework?
  • [RR] new persistence package

February 18, 2015

Updates

  • [RR] New VisTrails API and IPython integration (#24)

Items to Discuss

  • [TE] VTK wrapping
    • Benchmarking vtk package
      • Old: 24.7 seconds
      • New: 10.5 seconds (Except first time that adds 8 sec)
        • The parsing that calls is_abstract (that tries to instantiate all vtk classes) is now only run the first time.
        • get_items_from_sigstring takes 2 seconds, maybe we can use a lookup dict for already computed sigstrings?
    • Now using a general python function wrapper
      • VTK classes are wrapped into python function that does not depend on vistrails
      • VTK functions can be executed without vistrails
      • The spec maps functions into vistrails modules, but can also describe wrapping
      • A general python function wrapper that supports
        • kwarg inputs
        • single, list, dict outputs
        • callback for progress reporting
        • temporary file generator for using FilePool
        • optional output generation
      • Creating specs:
          • Create spec by hand
          • Auto-create spec outline (TODO) and manually finish it
          • Dynamically create spec (VTK)
          • Implement documentation wrappers (Can use scikit-learn wrapper to wrap numpydoc) (TODO)
          • Classes that work poorly as functions need to be wrapped in new functions before they are wrapped. This is different for each package.
            • Classes are hard, as in VTK and matplotlib. Scikit-learn still does not wrap classes
          • Spec diffing and patching could be done using code from matplotlib.
    • Still needs upgrades from old VTK package
      • Is it possible to dynamically wrap functions, e.g, you see a SetFunc and just remove the 'Set' prefix. Or do you need to create a complete mapping?
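A rough sketch of the "kwarg inputs; single, list, dict outputs" part of the general function wrapper described above. It is not the actual wrapper code: `call_wrapped` and the example functions are hypothetical, and progress callbacks and temporary files are left out.

```python
def call_wrapped(func, output_names, **inputs):
    """Call `func` with keyword inputs and return its outputs as a dict.

    A dict result is filtered to the requested names, a single requested
    output takes the return value as is, and otherwise the returned
    sequence is matched to `output_names` by position.
    """
    result = func(**inputs)
    if isinstance(result, dict):
        return {name: result[name] for name in output_names if name in result}
    if len(output_names) == 1:
        return {output_names[0]: result}
    return dict(zip(output_names, result))


def add(a, b):
    return a + b


def div(a, b):
    return a // b, a % b


def stats(values):
    return {"low": min(values), "high": max(values)}


single = call_wrapped(add, ["sum"], a=1, b=2)
several = call_wrapped(div, ["quotient", "remainder"], a=7, b=2)
partial = call_wrapped(stats, ["high"], values=[3, 1, 2])
```

The wrapped functions know nothing about VisTrails, in line with the point that they can be executed without it; requesting only `high` from `stats` loosely mirrors optional output generation.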

February 11, 2015

Updates

  • Update from Friday's meeting
    • discussed VisTrails internals
    • discussed wrapping
      • xml discussion, hard to modify because tied to db code
      • TE has made it possible to add the schema-defined attributes to the intermediate representation
      • higher-level operations on the port specs
    • make sure the simple case works
      • [JF] take a simple package with documentation and figure out what the base case for wrapping is

Items to Discuss

  • [TE] VTK wrapping
    • Dynamic loading works
    • Reading XML is fast enough, but unserializing data is slow
    • Working on patterns for patching
    • Matplotlib has many advanced patterns like argument ordering, nested arguments, alternateSpecs, output types.
      • Having all this in a general wrapper might confuse users?
    • [RR] Delay module (except for identifiers) until you need it---e.g. don't deal with port specs, etc. until necessary
  • Scripting Support #950
    • [RR] Issue with getting code from modules
    • Design a simple solution
    • [JF] Couldn't you use modules as black boxes without conversion, just to call into modules/subworkflows easily from e.g. IPython?
      • [RR] This is a job for the API, and a very separate use case. See #24

February 4, 2015

Updates

Items to Discuss

  • Wrapping
    • Format to use? Currently XML (like current matplotlib)
      • JSON and YAML have simple "to python dictionary" methods
      • But don't stream
      • YAML a lot easier for humans
    • [DK] vtk-new-package also changes parameter names, creates enumerations
      • intermediate schema needs to be extensible
      • packages will want to store their specific info for compute() method generation
      • also might have specs-altering info, like matplotlib's alternateSpec
    • representation to code, registry already has schema for some aspects
    • [RR] We might want to see if Module subclasses can be created lazily
      • no need to create all the classes just to register them in the registry and never actually use most of them
      • future effort
  • [RR] Where should VisTrails packages live?
    • tej installs as 'vistrailspkg.tej', TE installed it as 'userpackages.tej'
    • Currently, standard packages are 'vistrails.packages.', user packages are 'userpackages.' and packages loaded through pkg_resources might be anything
    • [RR] Use 'vistrailspkg.' everywhere?
    • Long-term effort to simplify package distribution/installation (and have VisTrails get them automatically?)
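The lazy creation of Module subclasses raised in the wrapping discussion above could work along these lines. This is a sketch only: `LazyRegistry` is a hypothetical stand-in, not the VisTrails registry, and a real spec would carry ports rather than plain class attributes.

```python
class LazyRegistry(object):
    """Registers specs up front, builds each class only on first lookup."""

    def __init__(self):
        self._specs = {}
        self._classes = {}
        self.built = 0  # how many classes were actually created

    def register(self, name, spec):
        self._specs[name] = spec

    def get(self, name):
        if name not in self._classes:
            spec = self._specs[name]
            self._classes[name] = type(name, (object,), dict(spec))
            self.built += 1
        return self._classes[name]


registry = LazyRegistry()
for module_name in ["Reader", "Filter", "Writer"]:  # made-up module names
    registry.register(module_name, {"kind": module_name.lower()})

reader_class = registry.get("Reader")
same_class = registry.get("Reader")
```

Three specs are registered but only one class is built, and repeated lookups return the same class object.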

January 28, 2015

Updates

  • T. Caswell to come visit on Fri 6 to discuss wrapping work

Items to Discuss

  • [TE] New VTK wrapping
    • Current code by DK seems a good deal faster
    • Generates XML that can be patched/tweaked, generates Python code from it, VisTrails only loads generated Python code
    • RR would rather have VisTrails load intermediate representation (= XML) directly, wants to make sure this is not slower
    • The goal is to turn the intermediate step into something generic that would be used for every wrapped package (vtk, numpy, matplotlib, sklearn, java) instead of each having its own
    • TC has separate code at github:VTTools that parses numpy docstrings and generates modules; it doesn't yet handle classes or persist anything
  • Web crawler
    • Right now, TE starts jobs for "start crawler", "stop crawler", "install classifier"
    • RR would rather have the crawling be a job as far as VisTrails and tej are concerned
    • The whole thing would be one pipeline: load examples, train classifier, start crawler [check for job, kill previous one, upload model, start processes], get snapshot, visualize
      • Need some support in tej and job submission system: long-running jobs, stop a job (wait for it to finish?), restart a job even though results are cached

January 21, 2015

Updates

Items to Discuss

  • [RR] Unified wrapping method discussion #991
    • TE to work on reusable method with intermediate representation, starting with VTK
  • [RR] Examples for scikit-learn: JF has an old example using Weka with parameter exploration (not currently in source tree)
    • AM's examples are enough
  • [AM] scikit-learn package is done, merge it in? #955
    • RR will merge
  • [RR] What should copyright headers say? #994
    • Let's keep everything in there: Utah/Poly/NYU

January 14, 2015

Updates

  • [TE] Working on classifier
  • [RR] Scripting integration, work in progress

Items to Discuss

  • [RR] Unified wrapping method discussion (#991)
    • Let's talk next week, [AM] and [DK] are not here

January 7, 2015

Updates

Items to Discuss

  • make sure that we address critical issues, questions, and pending review branches in a timely manner
  • scripting support
    • [RR] no issues if we want to just keep annotations in the generated code to allow the link back to a workflow
    • [RR] can translate from workflow to script, working on script to workflow
    • will work for parameter value changes, structural changes require changes to the annotations
    • need to publish best practices here
    • would be cool to do looping in scripts (easier interface than with workflows)
  • notebook support (convert from notebook to workflow)
    • RR will sync with FC on this
  • Issue with console in built-from-scratch
    • [TC] IPython rearranged some of the completion stuff in 2.2 and 2.3
    • binary has old version of IPython -> 1.0.0, should we update?
  • [TC] automated wrapping of numpy and scipy
    • discovered a bunch of malformed documentation in numpy and scipy
    • has github repo for vistrails tools
    • example modules wrap a bunch of R stuff (not baked in, just how things are)
    • will be pushing wrapping logic up
    • port names forbidden (window and domain)
    • have an import hook to get from yaml directly to VisTrails Modules
    • should work for any python modules with well-formed numpy docstrings.
  • [Action] should make it clear in documentation that Constant now means serializable not that the value doesn't change (e.g. List)
  • [TC] might be interesting to try to build components of matplotlib and accumulate in figure (long-term project, but thinking about how this might work)
  • [TE] build and build scripts
    • completely automatic, buildbot
    • need to set the build machines for the environment we want for the binary
    • would virtualenv work here?
    • [TC] anaconda can pin versions, potential path to test different configurations
    • Q: upload nightly binary builds? A: makes sense, make sure they are well-labeled
  • sourceforge stats: e.g. http://sourceforge.net/projects/vistrails/files/vistrails/nightly/vistrails-src-nightly.tar.gz/stats/timeline?dates=2014-01-01+to+2015-01-07
  • package issues (see Remi's message)
  • [TE] Scope of tej
    • Support single ssh commands?
    • Queue can be used as a remote machine (crawler is using queue.call*)