CS6093/Lectures
Revision as of 19:17, 13 February 2012
Make sure to check my.poly.edu for course announcements
Every week, you must write position papers for the papers in the Required Readings list.
Week 1 - Jan 24
- Course overview (First day of classes!)
http://vgc.poly.edu/~juliana/courses/cs6093/Lectures/lecture1.pdf
- Provenance and Workflows
http://vgc.poly.edu/~juliana/courses/cs6093/Lectures/provenance-workflows.pdf
Readings
- Provenance and Scientific Workflows: Challenges and Opportunities Susan Davidson and Juliana Freire. In Proceedings of ACM SIGMOD International Conference on Management of Data, 2008. Tutorial resources
- Provenance for Computational Tasks: A Survey Juliana Freire, David Koop, Emanuele Santos, and Claudio T. Silva. In IEEE Computing in Science & Engineering, 2008.
- Querying and Creating Visualizations by Analogy. Carlos E. Scheidegger, Huy T. Vo, David Koop, Juliana Freire and Claudio T. Silva. IEEE Transactions on Visualization and Computer Graphics, 13(6), pp. 1560-1567, 2007. Best paper in IEEE Visualization 2007.
Week 2 - Jan 31
- Provenance and Workflows (cont.)
http://vgc.poly.edu/~juliana/courses/cs6093/Lectures/provenance-workflows.pdf
- Discussion about literature search
Readings
Same as last week
Week 3 - Feb 7
- Information extraction: survey
http://vgc.poly.edu/~juliana/courses/cs6093/Lectures/information-extraction.pdf
Announcements
- The topic winners were: Information Extraction, Deep Web, Relational Data on the Web, Web Schema Matching, NoSQL DB, Provenance in DB, Graph Indexing, Usable query interfaces
- I will email you the preliminary assignments tomorrow
Assignment
- Write a position paper for the article: ONDUX: on-demand unsupervised learning for information extraction
Readings
- A survey of approaches to automatic schema matching. Erhard Rahm and Philip A. Bernstein, VLDB Journal 2001
- A Brief Survey of Web Data Extraction Tools. Alberto H. F. Laender, Berthier A. Ribeiro-Neto, Altigran Soares da Silva, Juliana S. Teixeira: SIGMOD Record 31(2): 84-93 (2002)
- ONDUX: on-demand unsupervised learning for information extraction. Eli Cortez, Altigran Soares da Silva, Marcos André Gonçalves, Edleno Silva de Moura: SIGMOD Conference 2010: 807-818
Some history and perspective:
- Data integration: the teenage years. A. Halevy, A. Rajaraman, J. Ordille. VLDB 2006.
- Generic Schema Matching, Ten Years Later. Philip A. Bernstein, Jayant Madhavan, Erhard Rahm: PVLDB 4(11): 695-701 (2011)
Week 4 - Feb 14
- Provenance and Databases
- Graph Indexing
Assignment
- Write 2 position papers --- one for each of the articles in the required reading for this week (see below)
Required Reading
- Peter Buneman, Sanjeev Khanna, Wang Chiew Tan: Why and Where: A Characterization of Data Provenance. ICDT 2001: 316-330
http://db.cis.upenn.edu/DL/whywhere.pdf
- Presenter: Fernando Seabra
- Rebuttal: Joe Miller (tentative)
- Graph Indexing: Tree + Delta >= Graph. P. Zhao, J. X. Yu, and P. S. Yu. VLDB 2007.
- Presenter: Nivan Ferreira
- Rebuttal: Sergey Nepomnyachiy (tentative)
Additional Suggested Reading
- A. Das Sarma, M. Theobald, and J. Widom. LIVE: A Lineage-Supported Versioned DBMS. Proceedings of the 22nd International Conference on Scientific and Statistical Database Management, Heidelberg, Germany, June 2010.
http://ilpubs.stanford.edu:8090/926/1/versioning-TR.pdf
- Total Recall | Oracle Database
http://www.oracle.com/technetwork/database/focus-areas/storage/total-recall-whitepaper-171749.pdf
- Provenance in Databases: Past, Current, and Future. W. Tan. IEEE Data Engineering Bulletin.
- Closure-Tree: An Index Structure for Graph Queries. H. He and A. K. Singh. ICDE 2006.
- Answering pattern match queries in large graph databases via graph embedding. Lei Zou, Lei Chen, M. Tamer Özsu and Dongyan Zhao.
http://vgc.poly.edu/~juliana/courses/cs6093/Readings/graph-matching-vldbj2011
- Chenghui Ren, Eric Lo, Ben Kao, Xinjie Zhu, Reynold Cheng: On Querying Historical Evolving Graph Sequences. PVLDB 4(11): 726-737 (2011)
http://vgc.poly.edu/~juliana/courses/cs6093/Readings/evolving-graphs-vldb11.pdf
- Algorithmics and Applications of Tree and Graph Searching. D. Shasha, J. T. L. Wang, and R. Giugno. PODS 2002.
Week 5 - Feb 21
- NoSQL databases
Assignment
- Write a position paper for each of the required papers
Required Reading
- Parallel data processing with MapReduce: a survey. Lee et al, SIGMOD Record 2011
http://vgc.poly.edu/~juliana/courses/cs6093/Readings/lee-sigrec2011.pdf
- Pig Latin: a not-so-foreign language for data processing. C. Olston, B. Reed, U. Srivastava, R. Kumar, A. Tomkins. SIGMOD 2008.
Additional suggested reading
- SQL databases v. NoSQL databases. Michael Stonebraker, CACM 2010.
- NoSQL Databases. Christof Strauch. 2010.
http://www.christof-strauch.de/nosqldbs.pdf
For additional suggested readings, see http://www.vistrails.org/index.php?title=CS6093/Selected_Papers_and_Topics
Week 6 - Feb 28
TBD
Week 7 - March 6
- NoSQL Databases
Assignment
- Write a position paper for each of the required papers
Required Reading
- HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Avi Silberschatz, Alex Rasin. VLDB 2009.
- Efficient Processing of Data Warehousing Queries in a Split Execution Environment. Bajda-Pawlikowski et al., SIGMOD 2011
http://cs-www.cs.yale.edu/homes/dna/papers/split-execution-hadoopdb.pdf
For additional suggested readings, see http://www.vistrails.org/index.php?title=CS6093/Selected_Papers_and_Topics
Week 8 - March 13
Spring break - no class
Week 9 - March 20
TBD
Week 10 - March 27
- Web information integration
Assignment
- Write a position paper for each of the required papers
Required Reading
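- iMAP: Discovering Complex Semantic Matches between Database Schemas. R. Dhamankar, Y. Lee, A. Doan, A. Halevy, and P. Domingos. SIGMOD 2004.
http://pages.cs.wisc.edu/~anhai/papers/imap.pdf
- Automatic complex schema matching across Web query interfaces. Bin He and Kevin Chen-Chuan Chang, ACM Trans. Database Syst. 2006
http://portal.acm.org/citation.cfm?id=1132863.1132872&coll=GUIDE&dl=GUIDE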
Additional Reading
- A survey of approaches to automatic schema matching. Erhard Rahm and Philip A. Bernstein, VLDB Journal 2001
http://portal.acm.org/citation.cfm?id=767154
Week 11 - April 3
Week 12 - April 10
Week 13 - April 17
Week 14 - April 24
Week 15 - May 1
Project presentation