By Ned Yoxall

It’s usual for those of us here at Data Science for Social Good to want to see our projects thrown into the spotlight. More publicity means a wider audience, which means a greater impact (or so the theory goes).

Yet for my own “Team Charlotte” and our close partners at “Team Nashville”, the recent events in Baton Rouge, St. Paul and Dallas have shone a stark light on what is at stake. Our task is to work with the police departments in our respective cities to predict officers at risk of an adverse interaction with the public, where an “adverse interaction” might mean an excessive use of force, a sustained citizen complaint, or an on-duty injury.

As a Brit, I can’t pretend to make much sense of the complicated maelstrom swirling around race relations and police brutality (perceived or otherwise) here in the US. One thing is certain, though: debate on the subject is often remarkably ill-informed. My sense is that this is largely due to a lack of data. True discrimination – or the lack thereof – can only be uncovered once we know the circumstances surrounding each event.

It was against this backdrop that both the Charlotte and Nashville teams visited our departments. While most of our time was spent in meetings – discussing which features might best predict at-risk officers, or asking technical questions about how to link particular tables together – the standout highlight for all of us was a “ride-along” with a police officer.


Over the course of five hours, I accompanied an officer who dealt with a host of calls: comforting a mother recently attacked by her armed son, rushing with sirens blaring to the aid of the fire brigade, and placating a nine-year-old runaway attempting to escape his mother, to name a few. One traffic stop was particularly jarring for me: I asked my officer why he had touched the back of the van he had pulled over; his reply was that he wanted to leave his fingerprints on the vehicle in case the driver were to shoot him.

It’s rare in data science to be given such direct access to the people and actions recorded in the data. Not only did this give us a chance to ask questions about data collection and recording; it also gave us an unparalleled opportunity to speak to the officers on the ground. After all, these are the people who know better than anyone the factors that could lead to an adverse incident. We learnt, for example, that answering multiple suicide calls in a short period of time can take a heavy psychological toll, as can the collapse of family life following a divorce, or the death of a parent.

These examples may sound like statements of the excruciatingly obvious to you. But without my experience in that passenger seat, I would have found it all too easy to treat an officer like a robot, completely unaffected by the context of their life and circumstances when making split-second decisions in the line of duty. Before these ride-alongs, all I had was a stereotype of police officers derived from news media, television, and movies.

I was also reminded that such expressions of human nature are not limited to the officers on the front line. Supervisors who decide whether an incident counts as adverse can also err; their judgement will not be perfect 100% of the time. Putting my data science hat back on, this means that we need to apply a healthy dose of skepticism to our data: real life is messy; relational databases are not.

All of this leads me to believe that the gap between what gets recorded in a database and the nuance of what happens on the ground should be acknowledged in the wider discussion about the relationship between the police and the public. Within the police force there are as many opinions about how things should be run as there are officers, and a binary, moralistic view of “the police” being “right” or “wrong” is too often a gross over-simplification.

And so we need to bring this insight – that “the police” cannot be treated as one homogeneous group – to our work. When building our model predicting officers at risk of an adverse incident, we’ll be looking for ways to include as many contextual features as possible, so that we capture the nuance of each officer’s circumstances. Our hope is that this will allow us to create a system that makes both the police and the public safer. It may also restore a little trust in the police force, which, in turn, would be a vital first step along the road to reconciliation.
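To make the idea of “contextual features” concrete, here is a minimal toy sketch of what such a model might look like. Everything here is invented for illustration – the feature names (recent suicide calls, overtime hours, a recent major life event such as a divorce or bereavement), the synthetic data, and the simple logistic-regression fit are assumptions of mine, not the actual DSSG model or departmental data.

```python
# Toy sketch: predicting at-risk officers from contextual features.
# All features, data, and coefficients are hypothetical illustrations.
import math
import random

random.seed(0)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Synthetic officer-periods with three invented contextual features:
# [suicide calls in last 90 days, overtime (tens of hours), life-event flag]
data = []
for _ in range(1000):
    calls = random.randint(0, 4)
    overtime = random.uniform(0.0, 2.0)
    life_event = random.randint(0, 1)
    # Toy generative assumption: risk rises with call load, fatigue,
    # and recent personal upheaval.
    p = sigmoid(-2.0 + 0.6 * calls + 0.5 * overtime + 1.2 * life_event)
    label = 1 if random.random() < p else 0
    data.append(([calls, overtime, life_event], label))

# Fit a logistic regression with plain stochastic gradient descent.
w = [0.0, 0.0, 0.0]
b = 0.0
lr = 0.05
for _ in range(100):
    for x, y in data:
        pred = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        err = pred - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err

# Measure training accuracy of the fitted model.
correct = sum(
    (sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5) == (y == 1)
    for x, y in data
)
accuracy = correct / len(data)
print(f"training accuracy: {accuracy:.2f}")
```

The point of the sketch is not the particular algorithm but the shape of the inputs: each row describes an officer’s recent context, not just the officer, which is exactly the nuance a spreadsheet-level view of “the police” would miss.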