Assignment 2 - Data Exploration using SQL
Assignment Description
In your first assignment, you explored MTA data about subway fares using a data exploration tool. Now, you will further explore this data set using SQL.
You will use MySQL. Please follow the instructions in the lab notes to install the system.
You will work with the following tables:
create table fares_jan18(
remote varchar(10),
station varchar(100),
ff int,
sendis int,
7d int,
30d int,
students int);
create table fares_feb1(
remote varchar(10),
station varchar(100),
ff int,
sendis int,
7d int,
30d int,
students int);
create table stations(
name varchar(100),
lat float,
long float,
line varchar(50),
lines varchar(50));
To create these tables and populate them, first download the dump file from: http://vgc.poly.edu/~juliana/courses/BigData2014/Assignments/2-SQL/mta.sql Then, from a terminal window invoke MySQL and issue the following commands:
> use test;
> source mta.sql
If you are not running MySQL from the same directory where the file mta.sql is located, you need to specify the full path for the file.
You are encouraged to use the Web as a resource to find more information about MySQL.
You can exchange ideas with your classmates, but the work you submit should be your own. Copying is not allowed.
Submission Instructions
You will submit to NYU Classes two text files:
- as2.sql -- containing the queries you will write to answer the questions below. Use the following template for this file:
-- remember to end each query with a semi-colon
use test;
-- ********************Q1*******************
SELECT 'Q1' AS ' ';
<insert your query Q1>;
-- ********************Q2*******************
SELECT 'Q2' AS ' ';
<insert your query Q1>;
....
- as2.log -- containing the answers to you queries
Queries
- 1. How many subway stations are there in New York City?
- 2. List all stations in the Broadway Line.
- 3. List the latitude, longitude and number of full-fare tickets sold during the week of Jan 18 at all stations in the Broadway line in decreasing order of number of tickets.
- 4. List the difference between the number of full-fare tickets sold at the different stations of the Broadway line between Jan 18 and Feb 1.
- 5. List the difference between the number of 7-day and 30-day tickets sold at the different stations in the Broadway line between on Jan 18 and Feb 1.
- 6. Which station had the largest increase in the number of full-fare tickets sold between Jan 18 and Feb 1?
- 7. Which station had the largest decrease in the number of full-fare tickets sold between Jan 18 and Feb 1?
- 8. List the names of the stations whose decrease in the number of full-fare tickets sold between Jan 18 and Feb 1 is greater than 1000.
- 9. What was the average difference in the number of full-fare tickets sold between Jan 18 and Feb 1 across the stations of the Broadway line?
- 10. How many full-fare tickets were sold in NYC in the week of Jan 18?
- 11. List all the stops of the F train.