In the 2002 movie Minority Report (based on a short story by Philip K Dick), director Steven Spielberg imagined a future in which three psychics can “see” murders before they happen. Their clairvoyance allows Tom Cruise and his “Precrime” police force to avert nearly all potential homicides.
Twenty years on, in the real world, scientists and law enforcement agencies are using data mining and machine learning to mimic those psychics. Such “predictive policing”, as it is called, is based on the fact that many crimes—and criminals—have detectable patterns.
Predictive policing has enjoyed some successes. In a case study in the US, one police department was able to reduce gun incidents by 47% over the typically gun-happy New Year’s Eve. Manchester police in the UK were similarly able to predict and reduce robberies, burglaries and thefts from motor vehicles by double digits in the first 10 weeks of rolling out predictive measures.
Predictive policing has improved in leaps and bounds. In the past, humans had to manually pore over crime reports or filter through national crime databases. Now, in the age of big data, data mining and powerful computers, that process can be automated.
But merely finding information is not enough to deter crime. The data needs to be analyzed to detect underlying patterns and relationships. Scientists deploy algorithms and mathematical techniques such as machine learning, which loosely imitates the way humans learn from experience, to extract useful information and insights from existing data.
Recently, we turned to a mathematical method conceived in the 18th century to refine our approach. By tweaking an existing algorithm based on this method, we significantly improved its crime prediction rates.
This finding holds promise for applying predictive policing in under-resourced contexts like South Africa. This could help reduce crime levels—some of the highest in the world and rising. It’s a situation the country’s police force seems ill-equipped to curb.
Marrying two different approaches
Thomas Bayes was an 18th-century British mathematician. His famed Bayes’ theorem essentially describes the probability of an event occurring, given prior knowledge of conditions that may be related to that event. Today, Bayesian analysis is commonplace in fields as diverse as artificial intelligence, astrophysics, finance, gambling and weather forecasting. We fine-tuned the Naïve Bayes algorithm and put it to the test as a crime predictor.
Bayesian analysis can use probability statements to answer research questions about unknown parameters of statistical models. For example, what is the probability that a suspect accused of a crime is guilty? But going deeper—like calculating how poker cards may unfold, or how humans (especially humans with criminal intent) will act—requires increasingly sophisticated technologies and algorithms.
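To make the guilt question concrete, here is a minimal Python sketch of Bayes’ theorem. The function name `posterior` and all three input numbers are hypothetical, chosen only to show the arithmetic; they do not come from any real case or dataset.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    # P(E) via the law of total probability over "H" and "not H".
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs, for illustration only:
#   prior               - share of the suspect pool that is guilty
#   likelihood          - chance the evidence appears if the suspect is guilty
#   false_positive_rate - chance the evidence appears if the suspect is innocent
p_guilty = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(f"P(guilty | evidence) = {p_guilty:.3f}")
```

With these made-up inputs the posterior is roughly 15%, well below the 90% likelihood, because the prior is so low. The prior knowledge Bayes’ theorem builds in can outweigh even strong-looking evidence.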
Our research built on the Naïve Bayes classifier, a popular supervised machine learning algorithm, and applied it to crime prediction.
Naïve Bayes starts from the premise that features (the variables that serve as input) are conditionally independent: once the class being predicted is known, the value of one feature tells you nothing further about the others.
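The independence premise is what keeps the arithmetic simple: the score for each class is just the class frequency multiplied by one probability per feature. The Python sketch below illustrates this on a tiny categorical dataset. The feature names, crime labels and rows are invented for illustration, the add-one smoothing is our assumption, and this is not the code used in the research.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # value_counts[(feature, label)][value]
    feature_values = defaultdict(set)    # distinct values seen per feature
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the label with the highest naive Bayes score for `row`."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # log P(class) + sum over features of log P(feature value | class).
        # Summing the per-feature terms is the independence assumption.
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data: (time_of_day, location_type) -> crime type.
rows = [
    ("night", "street"), ("night", "street"), ("night", "residence"),
    ("day", "residence"), ("day", "residence"), ("day", "street"),
]
labels = ["robbery", "robbery", "robbery", "burglary", "burglary", "burglary"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("night", "street")))
print(predict(model, ("day", "residence")))
```

Working in logarithms turns the product of probabilities into a sum, which avoids numerical underflow when there are many features.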
We fine-tuned the Naïve Bayes algorithm by marrying it with another algorithm known as Recursive Feature Elimination. This tool assists in selecting the more significant features in a dataset and removing the weaker ones, with the objective of improving the results.
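The idea of pruning weak features can be sketched as a backward elimination loop wrapped around a Naïve Bayes classifier. The Python below is written in the spirit of Recursive Feature Elimination but is not the implementation used in the research. Standard implementations rank features by model weights, whereas this sketch assumes a different criterion: it drops the feature whose removal costs the least leave-one-out accuracy. The function names, feature names and data are all hypothetical.

```python
from collections import Counter, defaultdict
import math


def nb_predict(train_rows, train_labels, row):
    """Categorical naive Bayes with add-one smoothing; returns the best label."""
    class_counts = Counter(train_labels)
    counts = defaultdict(Counter)  # counts[(feature, label)][value]
    values = defaultdict(set)      # distinct values seen per feature
    for r, y in zip(train_rows, train_labels):
        for i, v in enumerate(r):
            counts[(i, y)][v] += 1
            values[i].add(v)

    def score(label):
        s = math.log(class_counts[label] / len(train_labels))
        for i, v in enumerate(row):
            s += math.log((counts[(i, label)][v] + 1)
                          / (class_counts[label] + len(values[i])))
        return s

    return max(sorted(class_counts), key=score)


def loo_accuracy(rows, labels, keep):
    """Leave-one-out accuracy using only the feature indices in `keep`."""
    projected = [tuple(r[i] for i in keep) for r in rows]
    hits = 0
    for k in range(len(rows)):
        train_rows = projected[:k] + projected[k + 1:]
        train_labels = labels[:k] + labels[k + 1:]
        hits += nb_predict(train_rows, train_labels, projected[k]) == labels[k]
    return hits / len(rows)


def eliminate_features(rows, labels, n_keep):
    """Repeatedly drop the feature whose removal hurts accuracy the least."""
    keep = list(range(len(rows[0])))
    while len(keep) > n_keep:
        # Accuracy obtained when each remaining feature is left out in turn.
        trial = {f: loo_accuracy(rows, labels, [i for i in keep if i != f])
                 for f in keep}
        weakest = max(keep, key=lambda f: trial[f])
        keep.remove(weakest)
    return keep


# Invented toy data: robbery only when it is night AND on the street.
# The third column is deliberately uninformative noise.
feature_names = ["time_of_day", "location_type", "noise"]
rows = [
    ("night", "street", "a"), ("night", "street", "b"),
    ("night", "street", "a"), ("night", "street", "b"),
    ("night", "residence", "a"), ("night", "residence", "b"),
    ("day", "street", "a"), ("day", "street", "b"),
    ("day", "residence", "a"), ("day", "residence", "b"),
]
labels = ["robbery"] * 4 + ["burglary"] * 6

selected = eliminate_features(rows, labels, n_keep=2)
print([feature_names[i] for i in selected])
```

On this toy data the loop discards the noise column and keeps the two informative features. On real crime data the ranking would depend heavily on the accuracy measure and on how ties between features are broken.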
We then applied our finessed algorithm to a popular experimental dataset extracted from the Chicago Police Department’s CLEAR (Citizen Law Enforcement Analysis and Reporting) system, which has been used to predict and reduce crime in that American city. That dataset has been applied globally because of the rich data it contains: it provides incident-level crime data, registered offenders, community concerns, and locations of police stations in the city.
We compared the results of our enhanced Naïve Bayes against those of the original Naïve Bayes, as well as against other predictive algorithms such as Random Forests and Extremely Randomized Trees (algorithms we have also worked on for crime prediction). We found that we could improve on the predictions of the original Naïve Bayes by about 30%, and could either match or improve on the predictions of the other algorithms.
Data and bias
While our model holds promise, there’s one element that’s sorely lacking in applying it to South African contexts: data. As the Chicago CLEAR system illustrates, predictive models work best when you have lots of relevant data to work with. But South Africa’s police force has historically been very tight-fisted with its data, perhaps due to confidentiality issues. I ran into this problem in my doctoral research on detecting and mapping crime series.
This is slowly shifting. We are currently running a small case study in Bellville, a suburb about 20km from Cape Town’s central business district and the area in which our university is located, using the South African Police Service data for predictive policing.
None of this is to suggest that predictive policing alone will solve South Africa’s crime problem. Predictive algorithms and policing are not without their flaws. Even the psychics in Minority Report, it turned out, were not error-free. Fears that these algorithms may simply reinforce racial biases, for instance, have been raised both in South Africa and elsewhere.
But we believe that, with continuous technological improvement, predictive policing could play an important role in bolstering the police’s responsiveness and may be a small step towards improving public confidence in the police.