Big Data Lab notes 02/19/14
Today, we will work on SQL queries.
Before the class, please download mySQL from from http://dev.mysql.com/downloads. There are many different versions, select MySQL Community Server.
Installing and starting mySQL
mySQL has an extensive documentation, including installation instructions for different OSes, for example, MacOS: https://dev.mysql.com/doc/refman/5.7/en/macosx-installation.html and for Windows: http://www.mysql.com/why-mysql/windows/
Several students had trouble installing the MySQL Startup Item, do not install this option!
I run mySQL on MacOS, so here's what I did after I installed the system to 'start' the database:
shell> cd /usr/local/mysql
shell> sudo ./bin/mysqld_safe
(ENTER YOUR PASSWORD, IF NECESSARY)
(PRESS CONTROL-Z)
shell> bg
(PRESS CONTROL-D OR ENTER "EXIT" TO EXIT THE SHELL)
You can also use the shortcut command: mysqlup
To connect to the MySQL server, on a terminal, invoke /usr/local/mysql/bin/mysql.
You can add aliases to your shell's resource file to make it easier to access commonly used programs such as `mysql' and `mysqladmin' from the command line. I have added the following 2 lines to my .bashrc:
alias mysql=/usr/local/mysql/bin/mysql alias mysqladmin=/usr/local/mysql/bin/mysqladmin
You can also modified my PATH environment variable to include /usr/local/mysql/bin:
export PATH=$PATH:./:/usr/local/mysql/bin:/usr/local/mysql/bin:/usr/local/hadoop-1.2.1/bin:/usr/local/pig-0.12.0/bin
Our running example
We will work with three tables. Here are the table definitions:
create table Sailors(
sid int PRIMARY KEY,
sname varchar(30),
rating int,
age int);
create table Reserves(
sid int,
bid int,
day date,
PRIMARY KEY (sid, bid, day));
create table Boats(
bid int PRIMARY KEY,
bname char(20),
color char(10));
You can download the sample file with these statements as well as some sample data from http://vgc.poly.edu/~juliana/courses/BigData2014/Lectures/Lab-SQL/sailors-mysql.sql
SQL Cheat Sheet
SELECT AcctNo, Amount
FROM ATMWithdrawal
WHERE Amount < 50;
A SQL query is evaluated as follows:
- First the FROM clause tells us the input tables
- Second, the WHERE clause is evaluated for all possible combinations from the input tables
- Third, the SELECT clause tells us which attributes to keep in the query answer