DataVis2012/Projects/Hillegass
Overview: The goal of this project is to collect tweets hashtagged with a specific college name (e.g., #Harvard) and find the most frequently used words in the tweets about that college. These words will then be rendered as a word cloud, which will help prospective students learn about the atmosphere of each college.
Implementation: The first step will be to write a script that pulls tweets associated with the college and then searches them for terms from a predetermined library of descriptors. The resulting word clouds can then be compared across colleges.
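The counting step could look something like the minimal sketch below. It assumes the tweets are already available as plain strings; the descriptor list `DESCRIPTORS`, the function name `count_descriptors`, and the sample tweets are all hypothetical placeholders, since the actual descriptor library has not been chosen yet.

```python
import re
from collections import Counter

# Hypothetical descriptor library; the project's real list is not yet defined.
DESCRIPTORS = {"fun", "party", "stressful", "beautiful", "boring", "friendly"}

def count_descriptors(tweets, hashtag, descriptors=DESCRIPTORS):
    """Count descriptor words across the tweets that contain the given hashtag."""
    tag = hashtag.lower()
    counts = Counter()
    for text in tweets:
        # Split into lowercase words, keeping a leading '#' so hashtags stay intact.
        tokens = re.findall(r"#?\w+", text.lower())
        if tag not in tokens:
            continue  # tweet is not about this college
        counts.update(t for t in tokens if t in descriptors)
    return counts

# Made-up sample tweets, for illustration only.
sample = [
    "Campus tour at #Harvard was beautiful, students were friendly",
    "Finals week at #Harvard is stressful, so stressful",
    "Great party last night #NYU",
]
print(count_descriptors(sample, "#Harvard").most_common())
```

The resulting counts are what a word cloud would size its words by: the higher the count, the larger the word.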
Another part of the project will use geotagged tweets from the college's surrounding area. The word clouds for the surrounding area will be compared with those built from the college's hashtag to discover how the atmosphere of the surrounding area compares to the atmosphere of the college. This comparison will also indicate how intertwined the surrounding area and the college are.
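One possible way to put a number on "how intertwined" two word clouds are is cosine similarity between their word counts. This is only an assumed choice of measure, not one the project has settled on, and the function name and sample counts below are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity of two word-count mappings.

    Returns 0.0 when the clouds share no words and 1.0 when the
    words appear in exactly the same proportions.
    """
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty cloud is treated as unrelated to anything
    return dot / (norm_a * norm_b)

# Made-up counts, for illustration only.
college = Counter({"stressful": 4, "friendly": 2})
area = Counter({"friendly": 3, "quiet": 5})
print(round(cosine_similarity(college, area), 3))
```

A score near 1 would suggest the college and its surroundings are described in much the same terms, while a score near 0 would suggest they are largely separate.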
The word clouds will update live (the tweets will be streamed in), and a log will be kept of the successive word clouds to see whether there is a steady evolution or whether the outcome is random and thus unhelpful. I will also plot the logged data to analyze how it changes over time.
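As a rough sketch of how the log could be checked for steady evolution, the example below compares the top words of each logged snapshot with the next one using Jaccard overlap. The snapshot format (one `Counter` per logging interval), the choice of Jaccard overlap, and the names `top_words` and `snapshot_stability` are all assumptions made for illustration.

```python
from collections import Counter

def top_words(counts, k=10):
    """Return the set of the k most frequent words in one snapshot."""
    return {word for word, _ in counts.most_common(k)}

def snapshot_stability(snapshots, k=10):
    """Jaccard overlap of the top-k words between each consecutive pair of snapshots."""
    scores = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        a, b = top_words(prev, k), top_words(curr, k)
        union = a | b
        # Two empty snapshots are treated as identical.
        scores.append(len(a & b) / len(union) if union else 1.0)
    return scores

# Made-up log of three snapshots, for illustration only.
log = [
    Counter({"stressful": 5, "friendly": 3, "fun": 1}),
    Counter({"stressful": 4, "friendly": 4, "party": 2}),
    Counter({"boring": 6, "quiet": 2}),
]
print(snapshot_stability(log, k=3))
```

Scores that stay high from one snapshot to the next would point to a steady evolution, while scores that jump around would suggest the word clouds are too noisy to be useful at that logging interval. These scores are also a natural quantity to plot over time.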
Notes/Ideas: How reliably the word clouds evolve may depend on the college. For example, the surrounding-area tweets and the college tweets may be more intertwined for a school in a very rural area, such as Penn State, than for one in a city, such as NYU.
The evolution may be steadier over longer periods of time. For example, week-to-week changes may be unreliable, while season-to-season changes may be more consistent.