Difference between revisions of "Course: Big Data Analysis"

From VistrailsWiki
Line 69: Line 69:
* [http://research.google.com/pubs/pub36726.html Large-scale Incremental Processing Using Distributed Transactions and Notifications]


== Week 4:  Monday Sept 30th - Query Processing on MapReduce and High-level Languages ==
== Week 4:  Monday Sept 30th - ''Invited lecture by Dr. C. Mohan (IBM)'' ==


* ''Invited lecture by Dr. C. Mohan (IBM)''
 
 
== Week 5: Monday Oct. 7th - Query Processing on MapReduce and High-level Languages ==


* Pig Latin and Query Processing:  
Line 82: Line 84:
* Second edition of the book: http://www.morganclaypool.com/doi/pdf/10.2200/S00295ED1V01Y201009MAS008


== Week 5: Monday Oct. 7th Invited Speaker: Torsten Suel ==


* Big Data and Information Retrieval. Invited lecture by Torsten Suel.
 
** Lecture notes: http://vgc.poly.edu/~juliana/courses/cs9223/Lectures/search-data.pdf




Line 92: Line 92:




== Week 7:  Monday Oct. 21st - Graph Algorithms ==
== Week 7:  Monday Oct. 21st - Invited Speaker: Torsten Suel ==
 
TODO
 
=== Readings ===
* [http://infolab.stanford.edu/pub/papers/google.pdf 1998 PageRank Paper]
* [http://lintool.github.com/MapReduceAlgorithms/MapReduce-book-final.pdf Data-Intensive Text Processing with MapReduce, Chapters 4 (Inverted Indexing for Text Retrieval) and 5 (Graph Algorithms)]
* [http://infolab.stanford.edu/~ullman/mmds/ch5.pdf Mining of Massive Datasets, Chapter 5 (Link Analysis)]
* Pregel: A System for Large-Scale Graph Processing. Google. [http://kowshik.github.com/JPregel/pregel_paper.pdf]


* Big Data and Information Retrieval. Invited lecture by Torsten Suel.
** Lecture notes: http://vgc.poly.edu/~juliana/courses/cs9223/Lectures/search-data.pdf




Line 178: Line 172:




== Week 14: Monday Dec. 9th - Recommendation Systems ==
== Week 14: Monday Dec. 9th - Graph Algorithms ==
 
TODO


=== Readings ===
* Ullman chapter 9
* [http://infolab.stanford.edu/pub/papers/google.pdf 1998 PageRank Paper]
 
* [http://lintool.github.com/MapReduceAlgorithms/MapReduce-book-final.pdf Data-Intensive Text Processing with MapReduce, Chapters 4 (Inverted Indexing for Text Retrieval) and 5 (Graph Algorithms)]
* [http://infolab.stanford.edu/~ullman/mmds/ch5.pdf Mining of Massive Datasets, Chapter 5 (Link Analysis)]
* Pregel: A System for Large-Scale Graph Processing. Google. [http://kowshik.github.com/JPregel/pregel_paper.pdf]





Revision as of 21:37, 11 September 2013

Fall 2013

This schedule is tentative and subject to change

Make sure to check my.poly.edu for course announcements

Week 1: Monday Sept. 9th - Course Overview

Required Reading

Additional References

Week 2: Monday Sept. 16th - Map-Reduce/Hadoop

Required Reading

Additional References

Week 3: Monday Sept. 23rd - Data Management for Big Data

Related Topics

Required Reading

Additional References

Week 4: Monday Sept. 30th - Invited lecture by Dr. C. Mohan (IBM)


Week 5: Monday Oct. 7th - Query Processing on MapReduce and High-level Languages

Required Reading



Week 6: Monday Oct. 14th - Fall Break - No class

Week 7: Monday Oct. 21st - Invited Speaker: Torsten Suel


Week 8: Monday Oct. 28th - Statistics is easy - Invited Speaker: Dennis Shasha


Week 9: Monday Nov. 4th - EM and Text Processing

TODO


Readings

  • Data-Intensive Text Processing with MapReduce, Chapter 6


Week 10: Monday Nov. 11th - Finding Similar Items and Information Integration

Required Reading

Homework Assignment

Due November 17th. Your assignment is available at http://www.newgradiance.com/services. Please see http://vgc.poly.edu/~juliana/courses/cs9223 for instructions on how to access this service.

Week 11: Monday Nov. 18th - Frequent Itemsets

Required Reading

  • Mining of Massive Datasets, Chapter 4

Homework Assignment

Due November 24th

Additional Reading


Week 12: Monday Nov. 25th - Clustering

Homework Assignment

Due Dec 1st

Readings

Further Readings


Week 13: Monday Dec. 2nd - Invited lecture by Enrico Bertini

Readings

The Value of Visualization. IEEE Visualization 2005. Jarke J. van Wijk. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.1138

Visualization Analysis and Design: Principles, Methods, and Practice. Tamara Munzner (Book Draft 2 from Sep. 2012). http://www.cs.ubc.ca/~tmm/courses/533-11/book/vispmp-draft.pdf


Week 14: Monday Dec. 9th - Graph Algorithms

TODO

Readings


Week 15: Monday Dec. 16th - Final Exam

Other topics

Provenance

Juliana Freire and Claudio Silva. In Computing in Science and Engineering 14(4): 18-25, 2012.

Juliana Freire, David Koop, Emanuele Santos, and Claudio T. Silva. In IEEE Computing in Science & Engineering, 2008.