Assignment 1 - Data Exploration
Latest revision as of 06:37, 9 February 2014
Assignment Description
During our lab, we explored MTA data about subway fares. For your assignment, you will further explore this data set and try to find at least 4 interesting facts/observations. Use your creativity!
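You will use VisTrails (http://www.vistrails.org) for this assignment, and you can start from the example we used in the lab (http://vgc.poly.edu/~juliana/courses/BigData2014/Assignments/1-DataAnalysis/mta-plots.vt). You can find more information about the data at http://www.vistrails.org/index.php/Lab_notes_02/06/14 and about VisTrails in the Users' Guide (http://www.vistrails.org/usersguide/v2.1/html).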
You are also encouraged to use the Web to learn more about the packages you will use (e.g., matplotlib) and to find additional data that might be interesting to integrate with the fare data.
You can exchange ideas with your classmates, but the work you submit should be your own. Copying is not allowed.
Submission Instructions
You will submit the vt file containing the trail of your analysis to NYU Classes. Some guidelines you should follow:
- The pipelines that correspond to the interesting facts you discover should be tagged using the following convention: Fact <number>. For example, Fact 1, Fact 2, etc. You can set the tag on the left pane in the History view (see screenshot below).
- You should add notes to these pipelines explaining your findings. The notes field is located below the tag.
- Make sure your pipelines are portable, i.e., I should be able to run them on my own machine. For example, you should avoid using files stored in your local file system.
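
The portability guideline above can be illustrated with a minimal sketch, for instance as code placed in a standalone script or in a VisTrails PythonSource module. It assumes the data is available at a public URL; the helper names `is_portable_source` and `read_csv_rows` are hypothetical and not part of VisTrails. The sketch uses Python 3 standard-library modules; under Python 2 the equivalent functions live in `urlparse` and `urllib2`.

```python
import csv
import io
from urllib.parse import urlparse
from urllib.request import urlopen


def is_portable_source(location):
    """Return True if location is an http(s) URL rather than a local path.

    Hypothetical helper: a local path such as /home/me/fares.csv only
    exists on one machine, so a pipeline that reads it is not portable.
    """
    scheme = urlparse(location).scheme.lower()
    return scheme in ("http", "https")


def read_csv_rows(location):
    """Download a CSV file from a URL and return its rows as lists of strings.

    Hypothetical helper: assumes the file is UTF-8 encoded.
    """
    if not is_portable_source(location):
        raise ValueError("not a portable data source: %r" % location)
    with urlopen(location) as response:
        text = response.read().decode("utf-8")
    return list(csv.reader(io.StringIO(text)))
```

Calling `read_csv_rows` with a local path raises `ValueError` instead of silently reading a file that only exists on your machine, whereas a call with an http(s) URL (for example a placeholder such as `http://example.org/fares.csv`) downloads the file wherever the pipeline runs.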