DataVis2012

[WORK IN PROGRESS: the final version will be available when the semester starts]

This page contains information on the Data Visualization course taught by Professor Cláudio Silva during Spring 2012 at the Polytechnic Institute of NYU.

This class meets on Monday nights; exact times and rooms TBA.

Course Overview

"Scientific- (or data-), and Information visualization are branches of computer graphics and user interface design that are concerned with presenting data to users, by means of images. The goal of this area is usually to improve understanding of the data being presented." (From Wikipedia.)

While it is difficult to define the field of visualization exactly, it is much easier to explain where the need for this field of study comes from. Computing, in its many forms, has been an enormous accelerator for science, leading to an information explosion in many different fields. As Moore's law and other advances in technology increase our capacity for acquiring, storing, and generating information, our ability to analyze these vast amounts of data with existing techniques and tools is not keeping up. Put simply, future scientific advances depend on our ability to comprehend the vast amounts of data currently being produced and acquired. Effectively understanding and leveraging the growing wealth of scientific data is one of the greatest research challenges of the 21st century.

There have been estimates of the amount of data being produced and stored by the human race that support this notion of an "information big bang." By some estimates, the human race generated more data in 2006 alone than in all the 40,000 years before. It is hard to imagine how this is the case, but just consider the amount of data generated by CT and MRI scans, laboratory tests, and data entries of a single major health center in the United States. Or consider the amount of data being generated by the London video surveillance system, or the U.S. National Security Agency electronic foreign surveillance initiative. Even the personal data that each of us receives is quite substantial. For starters, think of your e-mail, which is probably a few gigabytes each year; add all your photos and videos; and so on. These are the things you are aware of, but think of all the "traces" of yourself that you are leaving behind: all your Google searches, Yahoo! instant messages, credit card transactions, phone calls, cookies, and other information spread at each and every website you visit. This data adds up really quickly, and being able to analyze it becomes increasingly difficult.

In this course, we will be concerned with techniques for analyzing information and scientific data. We would like to emphasize that although the term "visualization" is somewhat recent, generally accepted to have been coined for the 1987 NSF report on scientific visualization, the "area" of visualization, in the sense of data understanding by visual representation or other visual means, can be considered hundreds of years old. What separates the old from the new is the availability of advanced computing capabilities, including modern computer graphics techniques, which form the backbone of modern visualization research.

We take the view that future advances in science depend on the ability to comprehend the vast amounts of data being produced and acquired. Visualization is a key enabling technology in this endeavor: it helps people explore and explain data through software systems that provide a static or interactive visual representation. A basic premise of visualization is that visual information can be processed at a much higher rate than raw numbers and text. As the cliché goes: "A picture is worth a thousand words."

Despite the promise that visualization can serve as an effective enabler of advances in other disciplines, the application of visualization technology is non-trivial. The design of effective visualizations is a complex process that requires a deep understanding of existing techniques and how they relate to human cognition. Although there have been enormous advances in the area, the use of advanced visualization techniques is still limited.

In this class, we will cover the principles and techniques necessary to generate these visualizations.

There will be no required textbook. Kitware's VTK User's Guide might be useful. We will be providing a detailed set of course notes for the class.

For the assignments, we will be using a variety of systems, including ParaView, VisTrails, VTK, and matplotlib.

Besides the assignments, there will be a midterm, a final, and (for graduate students) a project.

Course History

This course builds on the Visualization course taught at Utah for many years.

http://www.vistrails.org/index.php/SciVisFall2007 and http://www.vistrails.org/index.php/SciVisFall2008 are two previous editions of this course taught at the University of Utah.

The NYU-Poly offering is being revamped to include more material on information visualization, and a project for graduate students.

Lectures and consulting hours

We will meet once a week on Monday.

The instructor for the class is Claudio Silva.

The TA for the course is TBD.

Silva office hours: TBD.

TA office hours: TBD.

Please post your questions to datavis-course-teach [@vgc.poly.edu].

Schedule

We are likely to hold optional classes on Python, CMake, and VisTrails. Those will be discussed and announced in class.

Datasets

We will list the datasets used in the course here.

Reading

The class wiki page will contain up-to-date notes that reflect the material covered in class. We will also add pointers to supplementary material.

In the tentative schedule, there are hints on what to read before attending the class.

Tips for converting VTK pipelines

Reference Material

VisTrails User's Guide

Matplotlib User’s Guide

Dive Into Python

VTK User's Guide

Assignments

Assignments will be listed here.

Late Assignments

Assignments will not be accepted late. Students will be given a one-time two-day exemption for an unexpected event.

Grading

Your grade will be a combination of assignments, midterm and final.

Mailing List

There are two mailing lists for this class.

The datavis-course [@vgc.poly.edu] mailing list is the general student list for the course. You can sign up for it here:

http://vgc.poly.edu/mailman/listinfo/datavis-course

The datavis-course-teach [@vgc.poly.edu] mailing list is how you should contact the instructional staff. Please do not send mail to personal addresses.
