Computational Social Science Hack Day 2014

GT and Emory Computational Social Science enthusiasts again joined forces — this time for a hack day at Georgia Tech — following up on a fun and successful workshop at Emory in November. The 20 participants from the two universities pursued a variety of projects, including:

  • Structural balance theory in comic book narratives (Amish, Sandeep, and Vinay, with help from Vinodh)
  • Identifying noteworthy items in text from State-of-the-Union addresses (Tanushree and Yangfeng, with help from Jeff and Tom)
  • Combating the trafficking of minors, using text analysis and computer vision (Eric and Parisa)
  • Connecting spelling variation with sentiment analysis (Uma, Yi, and Yu)
  • Predicting Grammy award winners from Tweet volume (Jayita and Spurthi)
  • Mining OpenSecrets to find latent clusters of campaign donors and recipients (Jacob, Jon, and Munmun)

Since this is my blog post, I’ll take a little more time to talk about my project. Essentially, we read in OpenSecrets data and built a matrix of donors and recipients. The sparse SVD of this matrix revealed some interesting factors. Here are the top 7 candidates and top 7 donors for each of the most interesting factors, listed side by side, along with my own names for the factors.

----- Factor 2 (unions) -----
Julia Brownley (D) American Fedn of St/Cnty/Munic Employees
Ed Markey (D) American Assn for Justice
Ann Kirkpatrick (D) Intl Brotherhood of Electrical Workers
Ann McLane Kuster (D) International Assn of Fire Fighters
Timothy H. Bishop (D) Operating Engineers Union
Ron Barber (D) American Federation of Teachers
Cheri Bustos (D) National Assn of Letter Carriers
----- Factor 3 (insurers) -----
Kay R. Hagan (D) Metlife Inc
Max Baucus (D) American Council of Life Insurers
Ron Kind (D) Principal Life Insurance
Dave Camp (R) Massachusetts Mutual Life Insurance
Joseph Crowley (D) Morgan Stanley
Richard E. Neal (D) TIAA-CREF
Pat Toomey (R) UBS Americas
----- Factor 4 (finance) -----
Jeb Hensarling (R) Investment Co Institute
Randy Neugebauer (R) American Land Title Assn
Scott Garrett (R) Chicago Mercantile Exchange
Michael Grimm (R) Indep Insurance Agents & Brokers/America
Bill Huizenga (R) Bank of America
Sean P. Duffy (R) Securities Industry & Financial Mkt Assn
Steve Stivers (R) PricewaterhouseCoopers
----- Factor 6 (construction and industry) -----
Bill Shuster (R) American Council of Engineering Cos
Frank A. LoBiondo (R) Owner-Operator Independent Drivers Assn
Patrick Meehan (R) CSX Corp
David P Joyce (R) Carpenters & Joiners Union
Jim Gerlach (R) American Road & Transport Builders Assn
Nick Rahall (D) Norfolk Southern
Tom Petri (R) NiSource Inc
----- Factor 7 (arms manufacturers) -----
Adam Smith (D) Raytheon Co
Buck McKeon (R) Northrop Grumman
John Cornyn (R) Lockheed Martin
Joe Wilson (R) National Assn of Realtors
John Carter (R) AT&T Inc
Michael McCaul (R) BAE Systems
Lamar Smith (R) Honeywell International
----- Factor 8 (technology) -----
Jeanne Shaheen (D) Microsoft Corp
Ed Markey (D) Every Republican is Crucial PAC
Kay R. Hagan (D) Verizon Communications
George Holding (R) National Cable & Telecommunications Assn
Chris Coons (D) Google Inc
Joe Heck (R) National Assn of Broadcasters
Ron DeSantis (R) Viacom International
----- Factor 9 (energy + Halliburton) -----
Pete Olson (R) Halliburton Co
Ed Markey (D) Koch Industries
Joe Barton (R) Independent Petroleum Assn of America
Bill Johnson (R) National Cable & Telecommunications Assn
Michael G. Fitzpatrick (R) Occidental Petroleum
Mary L. Landrieu (D) Cellular Telecom & Internet Assn
Steve Scalise (R) DTE Energy
----- Factor 11 (communications) -----
Bob Goodlatte (R) Google Inc
Kelly Ayotte (R) Clear Channel Communications
Lindsey Graham (R) Sprint Corp
Tim Scott (R) Association of American Railroads
Susan Collins (R) Union Pacific Corp
Eric Swalwell (D) Norfolk Southern
John Thune (R) Microsoft Corp
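
For readers who want to try something similar, here is a minimal sketch of how factor lists like the ones above could be produced. It is not the code we wrote on the day. It assumes the contributions have already been parsed into (donor, recipient, amount) tuples, and the function names `build_matrix` and `top_per_factor` are made up for illustration. It uses SciPy's `svds` for the sparse SVD and ranks donors and recipients by the absolute value of their loadings, since the sign of a singular vector is arbitrary.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds


def build_matrix(records):
    """Build a sparse donor-by-recipient matrix.

    records: list of (donor, recipient, amount) tuples (assumed input format).
    """
    donors = sorted({d for d, _, _ in records})
    recipients = sorted({r for _, r, _ in records})
    d_idx = {d: i for i, d in enumerate(donors)}
    r_idx = {r: j for j, r in enumerate(recipients)}
    rows = [d_idx[d] for d, _, _ in records]
    cols = [r_idx[r] for _, r, _ in records]
    vals = [float(a) for _, _, a in records]
    # Repeated (donor, recipient) pairs are summed when the matrix is built.
    matrix = csr_matrix((vals, (rows, cols)),
                        shape=(len(donors), len(recipients)))
    return matrix, donors, recipients


def top_per_factor(matrix, donors, recipients, k=2, n=7):
    """Return (singular value, top donors, top recipients) for k factors."""
    u, s, vt = svds(matrix, k=k)  # requires k < min(matrix.shape)
    factors = []
    # Sort explicitly so the strongest factor comes first.
    for f in np.argsort(-s):
        top_d = [donors[i] for i in np.argsort(-np.abs(u[:, f]))[:n]]
        top_r = [recipients[j] for j in np.argsort(-np.abs(vt[f, :]))[:n]]
        factors.append((float(s[f]), top_d, top_r))
    return factors


# Toy data (invented): two unions give to two Democrats,
# two banks give to two Republicans.
toy_records = [
    ("UnionA", "Dem1", 4), ("UnionA", "Dem1", 6), ("UnionA", "Dem2", 10),
    ("UnionB", "Dem1", 10), ("UnionB", "Dem2", 10),
    ("BankA", "Rep1", 5), ("BankA", "Rep2", 5),
    ("BankB", "Rep1", 5), ("BankB", "Rep2", 5),
]

if __name__ == "__main__":
    m, d, r = build_matrix(toy_records)
    for value, top_d, top_r in top_per_factor(m, d, r, k=2, n=2):
        print(round(value, 2), top_d, top_r)
```

On the toy data, each factor should pick out one of the two donor/recipient blocks. On real data the raw dollar amounts are heavily skewed, so some rescaling (for example a log transform) before the SVD may be worth trying; that is a suggestion, not something we tested.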

Our plan was to learn more about these factors by connecting them with text from the Wikipedia pages of the companies and with the NOMINATE scores and committee memberships of the legislators. That might have been possible in a 24-hour hackathon, but we ran out of time just as I was starting to get reasonable topics for the Wikipedia pages.

We went into the day with the goal of building and strengthening connections across disciplines and institutions, and by that metric I think the day was a success. In any case, I had a blast working with new people and trying out some new ideas, and I’m confident this will impact my research in the long run. It was also a lot of fun to work with my own students and colleagues in a more collaborative setting. Taking a day off from endless paper and proposal deadlines (and non-stop email distractions) to hack on a new project felt like a mini-vacation.