“I don’t think an AI model should have that kind of power over people’s lives,” Pegram says.

Photograph: Alice Zoo

In response to WIRED’s public records requests, Avon and Somerset Police provided a huge trove of performance data for 13 risk models used between 2017 and 2024—including those used to predict missing people, antisocial behavior, and who was most likely to commit or fall victim to crime. WIRED passed this data, along with other contextual information about Avon and Somerset Police’s data science program, to the independent AI auditing firm Eticas for review. The verdict was damning.

“Most of these models produce low precision scores, meaning a high proportion of the individuals they flag as risks are incorrectly identified,” the data review found. A model used to help predict burglars appeared to operate with a precision rating lower than 10 percent for more than three years, according to the police data. According to Eticas, that meant fewer than one in 10 people flagged as high risk would actually offend. Other concerns included sharp shifts in the performance metrics of various models. “This is not typical of well-governed models in operational use,” the audit observed.
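To make the precision figure concrete, here is a minimal sketch of how the metric is usually calculated: the share of flagged people who actually go on to offend. The counts below are hypothetical, chosen only to illustrate a score under 10 percent; they are not taken from the police data or the Eticas review.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of flagged individuals who actually went on to offend."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("precision is undefined when nobody is flagged")
    return true_positives / flagged


# Hypothetical counts (an assumption for illustration, not real data):
# of 100 people flagged as high risk, 9 offended and 91 did not.
example_precision = precision(true_positives=9, false_positives=91)
print(f"precision = {example_precision:.0%}")
```

On these illustrative numbers, 91 of every 100 people flagged would have been flagged wrongly.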

A spokesperson for the Avon and Somerset Police told WIRED that the force chose not to deploy some of the models it developed, including the one relating to burglaries. When asked why the force had years of audit and performance data for models it did not use, the spokesperson said the audit process was “automated” and used data from a “static file which was not deleted when the decision was made not to deploy the model.”

The police force declined requests for interviews about its data science work and did not respond fully to a detailed list of questions. “Each model is scored based on its performance, and where issues are identified, they will be updated or turned off,” the Avon and Somerset Police spokesperson said in a statement, adding that models are reviewed by a police subject expert before they are deployed.

It’s not clear what steps Avon and Somerset Police took to address the risks raised by its own ethics committee in the early days of its data science work. The committee did not appear to discuss predictive analytics again after 2017, according to records request disclosures. And while Avon and Somerset Police says on its website that “each product and project” pursued as part of its data science work is reviewed by a dedicated ethics group, the spokesperson told WIRED “there has so far been no meeting held,” because “no model has been produced for which potential ethical issues have been identified.”

In response to one public records request, Avon and Somerset Police supplied a screenshot of a “bias check app” that appeared to monitor and compare average risk scores for white individuals and people of color, concluding there was “no significant difference between the two.” The Eticas review said: “Simply including ethnicity as a monitoring variable is not equivalent to testing whether the model produces discriminatory outcomes,” describing the absence of more detailed testing by ethnicity, gender, and socioeconomic status as “a significant omission.”

Asked whether he believes predictive analytics has a role to play in policing or social work, Davies says further work is needed. “When we were trying to do it, we were trying to do it for the right reasons, in the right way, but we didn’t have the capacity that it probably needed.” Part of that work should look at how risk models can inform workers without nudging them into foregone conclusions, he says. “There is a risk that staff see the computer say something and then don’t use their own judgment.”

Predictive analytics continues to play a significant role in policing and public services in the region. Bristol City Council still uses a risk-scoring model to assess the likelihood of a child falling out of education, employment, or training. Avon and Somerset Police’s latest audit data, provided in July last year, indicates that the model used by the Offender Management App correctly identifies just one in three of the people who actually go on to offend, while one in four of the people it flags as likely offenders do not offend.
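The two ratios in that audit data describe different things: the first is what data scientists call recall (how many actual offenders the model catches), and the second is the complement of precision (how many flagged people turn out not to offend). The sketch below uses hypothetical counts, picked only so that they reproduce the one-in-three and one-in-four ratios; the real underlying numbers are not given in the audit data described here.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of people who actually offended that the model flagged."""
    return true_positives / (true_positives + false_negatives)


def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Share of flagged people who did not go on to offend."""
    return false_positives / (true_positives + false_positives)


# Hypothetical counts (an assumption for illustration, not real data).
flagged_and_offended = 300      # true positives
offended_but_not_flagged = 600  # false negatives
flagged_but_did_not_offend = 100  # false positives

model_recall = recall(flagged_and_offended, offended_but_not_flagged)
model_fdr = false_discovery_rate(flagged_and_offended, flagged_but_did_not_offend)

print(f"recall = {model_recall:.2f}")
print(f"flagged but did not offend = {model_fdr:.2f}")
```

In this illustration the model misses two of every three people who offend, even though most of the people it does flag go on to offend.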
