Difference between revisions of "Course: Big Data 2016"

From VistrailsWiki
Line 113: Line 113:
** See quizzes on [http://www.newgradiance.com Gradiance] -- Distance measures and document similarity.


== Week 11 - April 4th: Large-Scale Visualization -- -- Invited lecture by Professor Claudio Silva ==
== Week 11 - April 4th: Large-Scale Visualization -- Invited lecture by Professor Claudio Silva ==


* Lecture notes:
Line 129: Line 129:
**http://vgc.poly.edu/~juliana/courses/BigData2016/Lectures/visualization/movies/SevereTstorm.mov


== Week 12 - April 11th: Exploring Spatio-Temporal Data -- Invited lecture by Dr. Harish Doraiswamy (NYU CDS) ==
== Week 12 - April 11th: Visualization: Using D3 --  Invited lecture by Bowen Yu ==
 
* Lecture notes and lab:
** http://vgc.poly.edu/~juliana/courses/BigData2016/Lectures/vis-d3.pdf
 
 
== Week 13 - April 18th: Data Cleaning - Invited lecture by Dr. Divesh Srivastava, AT&T Research ==
 
 
== Week 14 - April 25th: Exploring Spatio-Temporal Data -- Invited lecture by Dr. Harish Doraiswamy (NYU CDS) ==


* Lecture notes:
Line 137: Line 146:
** https://github.com/ViDA-NYU/aws_taxi


== Week 13 - April 18th: Data Cleaning - Invited lecture by Dr. Divesh Srivastava, AT&T Research ==




== Week 14 - April 25th: Association Rules  ==
 
== Week 15 - May 2: Association Rules  ==


* Lecture notes:
Line 155: Line 164:
* Homework Assignment
** See quiz on [http://www.newgradiance.com Gradiance] -- Association Rules.
== Week 15 - May 2:  Graph Analysis ==
* '''Lecture notes:''' http://vgc.poly.edu/~juliana/courses/BigData2015/Lectures/graph-algos.pdf
* Required Reading: Data-Intensive Text Processing with MapReduce. Chapters 5 -- Graph Algorithms





Revision as of 20:36, 11 April 2016

DS-GA 1004 - Big Data: Tentative Schedule -- subject to change

  • TAs:
    • Yuan Feng
    • Kevin Ye
  • Lecture: Mondays, 4:55pm-7:35pm at Silver 207
  • Some classes will include a lab session, so please always bring your laptop.

News

Week 1 - Jan 25: Course Overview

Week 2 - Feb 1: The evolution of Data Management and introduction to Big Data; Introduction to Databases and Relational Model

Week 3 - Feb 8: Introduction to Databases, Relational Model and SQL (cont.)

Week 4 - Feb 15: Holiday

Transparency and Reproducibility (1 week)

Week 5 - Feb 22: Data Exploration and Reproducibility

Big Data Foundations and Infrastructure (3 weeks)

Week 6 - Feb 29: Introduction to MapReduce

Week 7 - March 7: MapReduce Algorithm Design Patterns

Week 8 - March 14th: Spring Break

Week 9 - March 21st: Parallel Databases vs. MapReduce; Storage Solutions; Introduction to Spark

Big Data Algorithms, Mining Techniques, and Visualization (6 weeks)

Week 10 - March 28th: Finding similar items & Spark

  • Homework Assignment
    • See quizzes on Gradiance -- Distance measures and document similarity.

Week 11 - April 4th: Large-Scale Visualization -- Invited lecture by Professor Claudio Silva


Week 12 - April 11th: Visualization: Using D3 -- Invited lecture by Bowen Yu


Week 13 - April 18th: Data Cleaning - Invited lecture by Dr. Divesh Srivastava, AT&T Research

Week 14 - April 25th: Exploring Spatio-Temporal Data -- Invited lecture by Dr. Harish Doraiswamy (NYU CDS)



Week 15 - May 2: Association Rules


  • Suggested additional reading:
    • Fast algorithms for mining association rules, Agrawal and Srikant, VLDB 1994.
    • Data Mining Concepts and Techniques, Jiawei Han and Micheline Kamber, Morgan Kaufmann
    • Dynamic Itemset Counting and Implication Rules for Market Basket Data. Brin et al., SIGMOD 1997. http://www-db.stanford.edu/~sergey/dic.html
  • Homework Assignment


Week 16 - May 9: Final Exam

Week 17 - May 16: Project Presentations