Persistence Package
Persistence Package Issues
Currently, VisTrails uses one git repository per user. To support a repository that is shared among multiple users, we first need to take the metadata from the local sqlite database and also store it in git (or in a central repository).
We considered using git notes to store the metadata. Our hope was that git would transfer the notes along with regular pushes and pulls, but unfortunately that is not how it works: notes have to be pulled, pushed, and merged separately, which seems to add unnecessary complexity.
Instead of using git notes, David proposed storing the metadata as an ordinary file in git. For instance, if "input.csv" is one of the files tracked by the persistence package, we would also have a file such as "input.csv_md" next to it, storing its metadata (name, tags, user, date created, id, ...). But for the user to manage this metadata (e.g., edit tags or search for a specific file in the repository), we still need the sqlite database. Thus, we have two possible solutions so far:
- Use the solution proposed by David: when VisTrails opens (with the persistence package enabled), we update the local repository (pull) and load all the metadata into a local sqlite database. After making modifications, the user can ask to push to the shared repository (and we can push automatically when VisTrails closes or the package is disabled). The user can also ask to update the local repository again.
- Positive aspects:
- The solution allows users to work offline
- Negative aspects:
- Merging can be an issue when pushing: we might have conflicts not only in the files but also in the metadata, which would probably not be easy to resolve
- If the user has 1,000 files in the repository, then there would also be 1,000 metadata files
- Along with the git repository, users would also share a centralized sqlite database: all users would access the same database, so all the metadata could be stored in and retrieved from it directly.
- Positive aspects:
- No need for additional files to store metadata
- No problems with merging metadata (the database would guarantee consistency)
- Negative aspects:
- Users could not work offline
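To make the first solution more concrete, here is a minimal sketch of the two pieces it needs: writing a "_md" file next to a tracked file, and rebuilding the local sqlite database from all "_md" files after a pull. Only the "_md" naming and the metadata fields come from the discussion above. The JSON format, the `file_metadata` table, and the function names `write_metadata`, `rebuild_index`, and `find_by_tag` are assumptions made for illustration, not existing persistence package code. The git pull and push themselves are left out.

```python
import json
import sqlite3
from pathlib import Path

MD_SUFFIX = "_md"  # naming from the text: "input.csv" -> "input.csv_md"


def write_metadata(data_file, metadata):
    """Write metadata for data_file to its sidecar file; return the sidecar path."""
    sidecar = Path(str(data_file) + MD_SUFFIX)
    # Assumption: JSON, one key per line, sorted, to keep git diffs small.
    sidecar.write_text(json.dumps(metadata, sort_keys=True, indent=2) + "\n")
    return sidecar


def rebuild_index(repo_dir, conn):
    """Recreate the local metadata table from every sidecar file under repo_dir."""
    conn.execute("DROP TABLE IF EXISTS file_metadata")
    conn.execute(
        "CREATE TABLE file_metadata ("
        "id TEXT PRIMARY KEY, path TEXT, name TEXT, tags TEXT, "
        "user TEXT, date_created TEXT)"
    )
    count = 0
    for sidecar in sorted(Path(repo_dir).rglob("*" + MD_SUFFIX)):
        rel = sidecar.relative_to(repo_dir)
        if ".git" in rel.parts:
            continue  # skip git's own bookkeeping directory
        md = json.loads(sidecar.read_text())
        # Strip the suffix to recover the path of the tracked file.
        data_path = rel.as_posix()[: -len(MD_SUFFIX)]
        conn.execute(
            "INSERT INTO file_metadata VALUES (?, ?, ?, ?, ?, ?)",
            (md["id"], data_path, md["name"], ",".join(md["tags"]),
             md["user"], md["date_created"]),
        )
        count += 1
    conn.commit()
    return count


def find_by_tag(conn, tag):
    """Return the paths of all tracked files carrying the given tag."""
    rows = conn.execute("SELECT path, tags FROM file_metadata ORDER BY path")
    return [path for path, tags in rows if tag in tags.split(",")]
```

In this sketch the sqlite database is only a cache: it can be dropped and rebuilt from the "_md" files at any time, so the files in git remain the single source of truth. Editing a tag would mean rewriting the "_md" file and updating the table, and the conflict problem listed above still applies when two users edit the same "_md" file.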
The idea is to first implement one of the solutions for a single user, and then extend it to allow collaboration.