Data Visualization: Kaggle Contest Entry for SeeClickFix.com – Community Activity by Zipcode

Just wanted to make a quick post highlighting my last-minute entry for the visualization portion of the Kaggle SeeClickFix.com competition.

Overview of my entry:

My visualization entry consists of a dashboard summary and a time-series video showing SeeClickFix.com community activity, both historic and forecasted, by zip code. Each issue’s zip code was determined by reverse lookup of its longitude and latitude. Activity level is measured by the average number of community votes that an issue receives. The models are based on both the training (01/2012-04/2013) and test (05/2013-09/2013) data sets. The average votes for issues in the test data (05/2013-09/2013) were populated using our team’s #2 ranked prediction model.

Average votes was chosen as the metric for community activity because it is stable over time, with lower variance, and it best represents a community’s interest in fixing the issues. Views and comments are noisy, with very high variance, and are possibly influenced by users outside of a community. In addition, city_initiated and remote_api_created issues were filtered from the data set to make comparison across cities more reasonable.
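As a rough illustration of the filtering and averaging described above, here is a minimal Python sketch. It is not the code used for the entry: the field names (`zipcode`, `source`, `num_votes`), the exact spelling of the excluded source labels, and the sample records are all assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Issue sources excluded before aggregation (label spelling is an assumption).
EXCLUDED_SOURCES = {"city_initiated", "remote_api_created"}


def average_votes_by_zip(issues):
    """Return {zipcode: mean votes per issue}, skipping excluded sources."""
    votes = defaultdict(list)
    for issue in issues:
        if issue["source"] in EXCLUDED_SOURCES:
            continue  # drop city-initiated and API-created issues
        votes[issue["zipcode"]].append(issue["num_votes"])
    return {zipcode: mean(v) for zipcode, v in votes.items()}


# Made-up records, for illustration only.
sample_issues = [
    {"zipcode": "06511", "source": "web", "num_votes": 3},
    {"zipcode": "06511", "source": "mobile", "num_votes": 1},
    {"zipcode": "06511", "source": "city_initiated", "num_votes": 40},
    {"zipcode": "94601", "source": "web", "num_votes": 5},
    {"zipcode": "94601", "source": "remote_api_created", "num_votes": 1},
]

activity = average_votes_by_zip(sample_issues)
```

The excluded records never reach the average, so a single heavily voted city-initiated issue cannot inflate a zip code's activity level.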

The dashboard summary aggregates all the data into a single activity level for each zip code in each city. It is visualized using a heat map for each city and a tree-view of the most active zip codes overall. A fully interactive dashboard summary is available here: http://public.tableausoftware.com/views/Kaggle-SeeClickFix-ActivityByZipcode/Dashboard1?:embed=y&:display_count=no#1

And here is a non-interactive snapshot of the summary:

[Image: Activity By Zip-Summary]

The video consists of quarterly time-series heat maps of each city, using the same scale as the dashboard summary. In contrast to the aggregate dashboard summary, the time-series model illustrates how community activity levels change over time, including the forecasted activity levels for Q2 and Q3 2013. The video is available on YouTube: https://www.youtube.com/watch?v=DlE2uMZ44QQ
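The quarterly grouping behind those heat maps can be sketched along the following lines. This is only an assumed reconstruction in Python, not the Tableau workflow actually used: the field names (`zipcode`, `created`, `num_votes`) and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def quarter_label(d):
    """Map a date to a calendar-quarter label such as '2013-Q2'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def quarterly_average_votes(issues):
    """Return {(zipcode, quarter): mean votes per issue}."""
    buckets = defaultdict(list)
    for issue in issues:
        key = (issue["zipcode"], quarter_label(issue["created"]))
        buckets[key].append(issue["num_votes"])
    return {key: mean(v) for key, v in buckets.items()}


# Made-up records, for illustration only.
sample_issues = [
    {"zipcode": "06511", "created": date(2012, 1, 15), "num_votes": 2},
    {"zipcode": "06511", "created": date(2012, 3, 2), "num_votes": 4},
    {"zipcode": "06511", "created": date(2013, 5, 20), "num_votes": 1},
]

by_quarter = quarterly_average_votes(sample_issues)
```

Each (zip code, quarter) pair then corresponds to one cell in one frame of the time series.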


3 thoughts on “Data Visualization: Kaggle Contest Entry for SeeClickFix.com – Community Activity by Zipcode”

  1. Nice, what tools did you use to create the time-series video? It appears you used Tableau, but I could never figure out how to create an exportable video from Tableau using the time-series function. Teach me your ways!

  2. Hi James, I used screen capture software (Bandicam) to record the time-series from within Tableau. From there I added a few nice effects using a video editing program.

    Unfortunately at this time there are no direct exports from Tableau to create video without using screen recording software. Kind of annoying, no doubt.

  3. Ahh so that’s the trick, I’ll have to remember that for the future. Thanks and nice work! Congrats on achieving 2nd place in the main competition by the way, quite a feat!
