Difference between revisions of "Course: Big Data Analysis"

Revision as of 00:42, 3 September 2012

Make sure to check my.poly.edu for course announcements

Week 1: Monday Sept. 10th - Course Overview

Course overview (First day of classes!)
Student survey
Introduction to Big Data

Readings

Week 2: Monday Sept. 17th - Map-Reduce

Introduction to map-reduce
Introduction to Hadoop
Map-Reduce ecosystem: Pig, Hive, Jaql, Mahout, BigInsights

Readings

Week 3: Monday Sept. 24th - Statistics is easy

Guest lecture by Dennis Shasha
Statistics and Big Data

Readings

http://www.morganclaypool.com/doi/abs/10.2200/S00142ED1V01Y200807MAS001 -- book is available for free for NYU students
JF: add references for issues related to stats and big data

Week 4: Monday Oct. 1st - Databases and Big Data

Databases and Big Data

Readings

JF: ADD: NoSQL databases (reading papers from literature)

Column store vs. tuple store. HBase, MongoDB, VaultDB, Cassandra, HadoopDB (Facebook). Overview of different architectures, distributed databases vs. Hadoop, transaction support...

Week 5: Monday Oct. 8th - Finding Similar Items

Overview of information integration

Readings

Mining of Massive Datasets, chapter 3; information integration; entity resolution

Week 6: Monday Oct. 15th

Reading: inverted index and crawling (Lin chapter 4)
Ask Torsten (tentative, ask him for reading material)

Readings

Mining of Massive Datasets, Chapter 5
Data-Intensive Text Processing with MapReduce, Chapter 5

Week 7: Monday Oct. 22nd - Introduction to Visualization; Data stewardship and provenance

Guest lecture by Claudio Silva and Lauro Lins

Readings

Hellerstein (ask Claudio for additional references)
ADD: provenance and reproducibility

Week 8: Monday Oct. 29th - Graph Analysis

Graph algorithms, link analysis, social networks

Readings

Data-Intensive Text Processing with MapReduce, Chapter 4

Week 9: Monday Nov. 12th - Frequent Itemsets

Readings

Mining of Massive Datasets, Chapter 6

Week 10: Monday Nov. 5th - Mining Data Streams

Readings

Mining of Massive Datasets, Chapter 4

Week 11: Monday Nov. 19th - Clustering

Readings

Mining of Massive Datasets, Chapter 7

Week 12: Monday Nov. 26th - Recommendation Systems

Readings

Mining of Massive Datasets, Chapter 9

Week 13: Monday Dec. 3rd - EM algorithms for text processing

Data-Intensive Text Processing with MapReduce, Chapter 6

@@ Line 57: / Line 57: @@
-== Week 6:  Monday Oct. 15st - Graph Analysis ==
+== Week 6:  Monday Oct. 15st ==
-* Graph algorithms, link analysis, social networks
+* Reading: inverted index and crawling (Lin chapter 4)
+* Ask Torsten (tentative, ask him for reading material)
 === Readings ===
@@ Line 73: / Line 74: @@
 * ADD: provenance and reproducibility
+== Week 8: Monday Oct. 29th - Graph Analysis==
-== Week 8: Monday Oct. 29th - TBD swap oct 15==
+* Graph algorithms, link analysis, social networks
-* Reading: inverted index and crawling (Lin chapter 4)
-* Ask Torsten (tentative, ask him for reading material)
 === Readings ===

Contents

Week 1: Monday Sept. 10th - Course Overview

Readings

Week 2: Monday Sept. 17th - Map-Reduce

Readings

Week 3: Monday Sept. 24th - Statistics is easy

Readings

Week 4: Monday Oct. 1st - Databases and Big Data

Readings

Week 5: Monday Oct. 8th - Finding Similar Items

Readings

Week 6: Monday Oct. 15th

Readings

Week 7: Monday Oct. 22nd - Introduction to Visualization; Data stewardship and provenance

Readings

Week 8: Monday Oct. 29th - Graph Analysis

Readings

Week 9: Monday Nov. 12th - Frequent Itemsets

Readings

Week 10: Monday Nov. 5th - Mining Data Streams

Readings

Week 11: Monday Nov. 19th - Clustering

Readings

Week 12: Monday Nov. 26th - Recommendation Systems

Readings

Week 13: Monday Dec. 3rd - EM algorithms for text processing

Week 14: Monday Dec. 10th - Project presentation

Other Readings