DataVis2012

From VistrailsWiki

Latest revision as of 02:10, 5 March 2012

[WORK IN PROGRESS, final version will be available when the semester starts]

This page contains information on the Data Visualization course taught by Professor Cláudio Silva (http://vgc.poly.edu/~csilva) during Spring 2012 at the Polytechnic Institute of NYU.

This class meets on Monday nights, 6-8:25pm, location TBD.

Course Overview

Computing, in its many forms, has been an enormous accelerator for science, leading to an information explosion in many different fields. As Moore's law and other advances in technology increase our capacity for acquiring, storing, and generating information, our ability to analyze these vast amounts of data with existing techniques and tools is not keeping up. Simply put, future scientific advances depend on our ability to comprehend the vast amounts of data currently being produced and acquired. Effectively understanding and leveraging the growing wealth of scientific data is one of the greatest research challenges of the 21st century.

There have been estimates of the amount of data being produced and stored by the human race that support this notion of an "information big bang". By some estimates, in 2006 alone the human race generated more data than in all the 40,000 years before. In this course, we will be concerned with techniques for analyzing information and scientific data. We take the view that future advances in science and engineering depend on the ability to comprehend the vast amounts of data being produced and acquired. Visualization is a key enabling technology in this endeavor: it helps people explore and explain data through software systems that provide a static or interactive visual representation. A basic premise of visualization is that visual information can be processed at a much higher rate than raw numbers and text. As the cliché goes, "A picture is worth a thousand words".

Despite the promise that visualization can serve as an effective enabler of advances in other disciplines, the application of visualization technology is non-trivial. The design of effective visualizations is a complex process that requires a deep understanding of existing techniques and how they relate to human cognition. Although there have been enormous advances in the area, the use of advanced visualization techniques is still limited.

In this class, we will cover the principles, techniques, and tools necessary to generate these visualizations.

There will be no required textbook. We will be providing a detailed set of course notes for the class.

For the assignments, we will be using a variety of systems, including ParaView, VisTrails, VTK, matplotlib, and custom code developed for this class.

Besides the assignments, there will be a midterm, a final, and (for graduate students) a project.

Course History

This course builds on the Visualization course taught at Utah for many years, with contributions by Professors Chris Johnson, Chuck Hansen, and Ross Whitaker, among others. Two previous editions of this course taught at the University of Utah are available at http://www.vistrails.org/index.php/SciVisFall2007 and http://www.vistrails.org/index.php/SciVisFall2008.

The NYU-Poly offering is being revamped to include more material on information visualization, and a project for graduate students.

Lectures and consulting hours

We will meet once a week on Mondays.

The instructor for the class is Cláudio Silva.

The TA for the course is TBD.

Silva office hours: TBD.

TA office hours: TBD.

Please post your questions to datavis-course-teach [@vgc.poly.edu].

Schedule

We are likely to hold optional classes on Python, CMake, and VisTrails. Those will be discussed and announced in class.

Projects

Go to projects: http://www.vistrails.org/index.php/DataVis2012/Projects

Datasets

We will list the datasets used in the course here.

Reading

The class wiki page will contain up-to-date notes that reflect the material covered in class. We will also add pointers to supplementary material.

In the tentative schedule, there are hints on what to read before attending the class.

Tips for converting VTK pipelines

Reference Material

VisTrails User's Guide

Matplotlib User’s Guide

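Dive Into Python (http://diveintopython.org/toc/index.html)

ParaView User's Guide (http://paraview.org/Wiki/ParaView/Users_Guide/Table_Of_Contents)

VTK User's Guide (optional; this link is to buy the book): http://www.kitware.com/products/vtkguide.html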

Assignments

Assignments will be listed here.

Please note the CSE departmental policy on collaboration on programming assignments: http://cis.poly.edu/policies/

Late Assignments

Assignments will not be accepted late. Students will be given a one-time two-day exemption for an unexpected event.

Grading

Your grade will be a combination of assignments, the midterm, and the final.

Mailing List

There are two mailing lists for this class.

The datavis-course [@vgc.poly.edu] mailing list is the general student list for the course. You can sign up for it here:

http://vgc.poly.edu/mailman/listinfo/datavis-course

The datavis-course-teach [@vgc.poly.edu] list is how you should contact the teaching staff. Please do not send mail to personal email addresses.