Heroic Feats of Data Scientists Doing Good

February 6, 2013 Paul M. Davis

Photo by Sean MacEntee via Flickr. (CC BY 2.0)

Data scientists are a tireless lot: between using their multidisciplinary skills to solve critical business challenges, fielding job offers, and entering Kaggle competitions, many practitioners moonlight in service of the public good. These heroic data divers serve a critical need, devoting nights and weekends to munging messy XML files, developing models, and unearthing insights for non-profits and advocacy groups that lack these skills or resources. Such organizations have a plentitude of both internal and public data at their disposal; extracting insight from all that information, however, is a whole different matter.

As demonstrated by the volunteers who attend DataKind’s weekend-long DataDives and join the organization’s DataCorps, there are plenty of practitioners eager to devote their skills to serving the public good. But what possesses busy data science professionals to volunteer a free night or weekend for the modest wages of free pizza and beer?

In a recent series of blog posts, DataKind turns that question back to the “Data Heroes” who have done just that. One of those folks is Mike Stringer, managing partner at DataScope Analytics in Chicago. His volunteer team worked with the Chicago Red Cross to help the organization direct fire prevention efforts to high-risk areas. The work resulted in an interactive map and a discussion of how the Red Cross can streamline its data policies to perform real-time analysis of response efforts.

Mike Stringer presents at the DataKind Chicago DataDive. Photo by Paul M. Davis, via Shareable Magazine.

In the interview, Stringer notes, “Everyday I witness the unique ability for data…to be used as a valuable resource for solving important problems. Getting involved in helping non-profits use data as a resource to solve important societal problems feels good! Maybe even more gratifying is that it helps build and train a community that is passionate and capable of cultivating and communicating a fact-based view of their surroundings.” The sentiment is echoed by Sisi Wei, a data journalist at the Washington Post, who moonlights as a data ambassador for D.C. Action for Children. “I make time to have a positive impact on issues that matter to me,” she says. “It’s what I do during my day job. Why not do it in my spare time?”

DataKind gets plenty of deserved attention here on Datastream, but they’re not the only data-centric organization focused on social causes. Statistics Without Borders (SWB), an offshoot of the American Statistical Association, focuses on bringing the rigor and insight of statistical analysis to international health relief projects. Boasting almost 200 volunteers from academia, business, and government, the organization aided with survey design and analysis of data on the effects of the 2010 earthquake that ravaged Haiti. In such crises, quick response is essential, SWB emphasizes. “In the aftermath of this (or any other) natural disaster,” they note, “it is critical to quickly develop reliable estimates of the extent of damage to homes and displacement of people as well as the nature of the displacements.”

Gun deaths in the U.S. in 2010 visualization, by Periscopic.

Other outfits more explicitly serve an advocacy role. Periscopic is a data analytics shop that produces strikingly innovative client work as well as provocative in-house projects that address pressing social issues. The company’s charter is made clear in its tagline — “Do good with data”. Periscopic’s latest work, which visualizes the gun deaths in the U.S. in 2010, is its most powerful, and sobering, to date. As project leads Kim Rees and Dino Citraro explain in an interview with Co.Design, their goal was not to visualize this data in aggregate, but instead emphasize the human toll of each death. “We’re hoping that people will see these individual victims,” they state in the interview. “We’re not looking at aggregate numbers. We’re not trying to analyze this data. This data was living and breathing, and has now been extinguished.” Beginning with an animation of single murders, increasing in frequency and speed to reach a horrible crescendo of 9,595 murders during the year, the visualization has a powerful emotional and moral weight.

From the Mapping America exhibition by the Civic Data Design Lab.

The range and impact of data-for-good and advocacy efforts will only grow as institutional support increases. A number of such endorsements have recently emerged in the open data and civic tech space, such as the announcement of the Civic Data Design Lab at MIT and the forthcoming Knight News Challenge focused on open government, which launches on February 12th. The most high-profile yet is the White House Open Data Day Hackathon, scheduled for February 22nd, which invites developers and data hackers to come to the White House to develop APIs and visualizations on open government data. Invitations will be announced this Friday, February 8th, no doubt an honor for those chosen. But the numerous Data Heroes out there aren’t doing it for the accolades; instead, they’re devoting their time to social causes and organizations that greatly benefit from their unique and in-demand skills.

