By Rob Mitchum

In the second week of the fellowship, fellows and mentors have heard from a wide range of our project partners for the summer, learning from the experts about the problems they will soon tackle. Between talks, they’ve started working with the data and brainstorming ideas about what they hope to reveal and build to meet the partners’ needs. Oh, and a handful of them might watch a little bit of the World Cup too.

We’re happy to announce more information about the first quarter of our summer’s projects. In the next few days, we’ll provide additional details about the rest, before diving deeper into each project with a series of blog posts about how the teams are exploring their data and problems.

World Bank Group – Prediction & Identification of Collusion in International Development Projects


The World Bank Group lends billions of dollars every year to fund large infrastructure projects around the globe. Project-related contracts are awarded to companies and entities via open and competitive bidding processes. Such processes can sometimes be subject to collusion and corruption risks.

Working with data on major contract awards and projects, we will help develop a model that predicts potential collusion cases. The model will look for anomalous patterns of bidding and spending and subsequently alert the organization’s Integrity Unit to take a closer look at potentially suspicious behavior.

Mentor: Eric Rozier

Fellows: Jeff Alstott, Dylan Fitzpatrick, Carlos Petricioli, Misha Teplitskiy

Chicago Public Schools – Student Enrollment Prediction for Budget Allocation


Each spring, Chicago Public Schools allocates $1.8 billion to the hundreds of public schools in its system. To determine where to distribute that money, CPS must predict next year’s enrollment for each school months ahead of time, then adjust budgets two to three weeks into the school year when the actual enrollment numbers are set. Large discrepancies between projected enrollment and the real numbers lead to large adjustments in funding, which can disrupt teachers and students.

We’re working with CPS to develop a model that more accurately predicts next year’s enrollment for each school in the system. The project team will work with data from CPS on student, school, and staff attributes, as well as other data sources, including publicly available crime data, housing data, and economic development data. The goal is a frequently updating model that will lower the amount of money shuffled each school year and reduce the number of schools that face major re-allocations of funding.

Mentor: Joe Walsh

Fellows: Vanessa Ko, Andrew Landgraf, Tracy Schifeling, Zhou Ye

Harris School of Public Policy, Sunlight Foundation – Text Analysis of Government Spending Bills to Understand Pork Spending


Government legislation is not designed for readability, and its volumes of text are not easily analyzed. Advocacy and research groups would like a way to digest bills quickly, filtering out the bureaucratic jargon and leaving the important details. The Sunlight Foundation is a nonpartisan nonprofit that uses technology to make governments more accountable. Its API for federal bills is a valuable stream of legislative text that can be used for analysis given the right tools.

With Christopher Berry from the Harris School of Public Policy, we will develop these tools, transforming text into usable data for fast, in-depth analysis. The first use case will be spending bills; we will create a database of federal and state spending that identifies and organizes the what, where, how much, and who from legislation. Ideally, these tools will be universal, enabling organizations to search legislation for other topics as well.

Mentor: Joe Walsh

Fellows: Matthew Heston, Madian Khabsa, Vrushank Vora, Ellery Wulczyn