Projects

In partnership with government agencies and public interest groups, we work on projects spanning the spectrum of data collection, data analysis, and tool building. Below we describe some of the work we’re doing at the intersection of data science and public policy.

Stanford Open Policing Project
In a large-scale analysis of nearly 100 million traffic stop records, we found that Black drivers across the country were less likely to be stopped after sunset, when a "veil of darkness" masked their race. In addition, we demonstrated that Black and Hispanic drivers were routinely searched on the basis of less evidence when compared to white drivers. Finally, we observed that marijuana legalization in Colorado and Washington state reduced searches of stopped drivers — but the bar for searching Black and Hispanic drivers in these states after legalization was still lower than for white drivers.
Equitable Speech Recognition
Automated speech recognition (ASR) systems, which use sophisticated machine-learning algorithms to convert spoken language to text, have become increasingly widespread, powering popular virtual assistants, facilitating automated closed captioning, and enabling digital dictation platforms for health care. We studied the ability of five state-of-the-art commercial ASR systems to transcribe 20 hours of structured interviews with white and Black speakers. We found that all five of these leading speech recognition tools misunderstood Black speakers twice as often as white speakers.
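For readers who want to see how transcription mistakes can be counted, here is a minimal sketch of word error rate (WER), a standard ASR metric. The summary above does not say which metric the study used, so treating WER as the measure is an assumption made for illustration. The function name and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: normalization here is just lowercasing and
    whitespace splitting, which is simpler than a real evaluation pipeline.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word out of four.
example_wer = word_error_rate("a b c d", "a x c")  # 0.5
```

Averaging this quantity separately over each group of speakers gives one simple way to compare how often a system errs for each group.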
Blind Charging
Prosecutors have nearly absolute discretion to charge or dismiss criminal cases. There is concern, however, that these high-stakes judgements may suffer from explicit or implicit racial bias, as with many other such actions in the criminal justice system. To reduce potential bias in charging decisions, we designed a new algorithm that automatically redacts race-related information from free-text case narratives.
One Person, One Vote
Prominent officials have alleged that millions of people vote twice in presidential elections, calling into question the bedrock of democratic governance. Past investigations have found no indication of widespread voter fraud, but critics argue that it’s simply hard to detect. In the most comprehensive study of voter fraud to date, we examined over 100 million voting records for the 2012 presidential election. We found that double voting is exceedingly rare. We further found that one popular effort to prevent double voting — the Interstate Crosscheck Program — can in practice burden hundreds of legitimate voters for every double vote prevented.
Pretrial Nudges
To help ensure individuals facing criminal charges are able to appear in court, we built a mobile app and online platform that automatically delivers pretrial nudges to individuals with upcoming court dates. We are using modern reinforcement learning methods to personalize the content of court date reminders, choosing the reminder strategy that is most effective for each recipient. We similarly plan to use data-driven strategies to identify the individuals most likely to benefit from transportation assistance, such as a free ride to court through a rideshare service.
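As a rough illustration of how reminder content might be personalized, the sketch below uses Thompson sampling, a simple bandit method from the reinforcement learning family. It is not the platform's actual algorithm, which is not described here. The class name, the idea of grouping recipients into segments, and the strategy labels are all hypothetical.

```python
import random
from collections import defaultdict


class ReminderBandit:
    """Beta-Bernoulli Thompson sampling over reminder strategies.

    Hypothetical sketch: recipients are grouped into segments, and each
    (segment, strategy) pair keeps its own record of court appearances.
    """

    def __init__(self, strategies, seed=None):
        self.strategies = list(strategies)
        self.rng = random.Random(seed)
        # (segment, strategy) -> [alpha, beta], starting from a uniform Beta(1, 1) prior.
        self.counts = defaultdict(lambda: [1, 1])

    def choose(self, segment):
        """Sample a plausible appearance rate per strategy and pick the highest."""
        samples = {
            s: self.rng.betavariate(*self.counts[(segment, s)])
            for s in self.strategies
        }
        return max(samples, key=samples.get)

    def update(self, segment, strategy, appeared):
        """Record whether the recipient appeared in court after the reminder."""
        self.counts[(segment, strategy)][0 if appeared else 1] += 1


# Hypothetical usage: pick a strategy, send it, then record the outcome.
bandit = ReminderBandit(["plain_reminder", "reminder_with_directions"], seed=0)
strategy = bandit.choose("first_court_date")
bandit.update("first_court_date", strategy, appeared=True)
```

Sampling from each strategy's posterior, rather than always picking the current best, keeps some exploration going while gradually favoring whichever strategy works best for a given segment.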
Surveilling Surveillance
Surveillance cameras are everywhere. Governments, businesses, and homeowners all use cameras to detect and potentially deter crime. But advances in facial recognition, predictive policing, and hacking make cameras an increasing threat to privacy. Using computer vision, we analyzed over 1 million street-view images to estimate the prevalence and placement of cameras in 16 cities around the world.
Debtors' Prisons
In almost every state, courts can jail those who fail to pay fines, fees, and other court debts—even those resulting from traffic violations or other non-criminal violations. While imprisoning someone for failing to pay a debt remains illegal on paper, these aggressive debt-enforcement tactics have led to the de facto reemergence of debtors’ prisons. Many believe that thousands of people across the country are jailed each year for unpaid fines and fees, but a dearth of data has made it difficult to rigorously assess and curb modern-day debt imprisonment practices.
Sentencing Enhancements
Sentencing enhancements are laws that increase the total incarceration term for a crime based on aspects of how the crime was committed or who committed it. While Three Strikes may be the best-known enhancement law, there are dozens of such statutes in the California penal code. However, little research has been done on how they affect sentencing and incarceration, even though many have suspected that sentencing enhancements contributed considerably to the overcrowding crisis in the California penal system. In partnership with the San Francisco District Attorney, we found that enhancements account for a significant proportion of time served in jail or prison: over one in four years served from felony sentences in San Francisco. Our results show that while enhancements are applied in a relatively small proportion of felony cases, they more than double the sentence imposed for the base crime in these cases.
Reducing Incarceration in St. Louis
In recent years, the City of St. Louis has seen strong popular and political interest in criminal justice reform. However, the lack of timely access to high-quality criminal justice data has been an obstacle to crafting intelligent policies to effect such reform. Over the past year, SCPL developed dashboards that describe the city’s current and historical jail population. These dashboards allow officials in the city’s Corrections Division to find immediate answers to questions that would previously have been difficult to answer. In addition, making this information public allows the city and its residents to have a more substantive conversation about corrections in St. Louis, including the question of closing the older jail facility in the city.
MathBot
MathBot is a chatbot that teaches concepts, provides practice problems, and offers tailored feedback and explanations. To help personalize the experience, we used reinforcement learning algorithms to tailor the pace and flow of the conversation to the needs of each individual.