In response to WIRED’s Freedom of Information request, TfL says it used existing CCTV images, AI algorithms, and “numerous detection models” to detect patterns of behavior. “By providing station staff with insights and notifications on customer movement and behaviour they will hopefully be able to respond to any situations more quickly,” the response says. It also says the trial has provided insight into fare evasion that will “assist us in our future approaches and interventions,” and that the data gathered is in line with its data policies.
In a statement sent after publication of this article, Mandy McGregor, TfL’s head of policy and community safety, says the trial results are still being analyzed and adds that “there was no evidence of bias” in the data collected from the trial. During the trial, McGregor says, there were no signs in place at the station that mentioned the tests of AI surveillance tools.
“We are currently considering the design and scope of a second phase of the trial. No other decisions have been taken about expanding the use of this technology, either to further stations or adding capability,” McGregor says. “Any wider roll out of the technology beyond a pilot would be dependent on a full consultation with local communities and other relevant stakeholders, including experts in the field.”
Computer vision systems, such as those used in the test, work by trying to detect objects and people in images and videos. During the London trial, algorithms trained to detect certain behaviors or movements were combined with images from the Underground station’s 20-year-old CCTV cameras, with imagery analyzed every tenth of a second. When the system detected one of 11 behaviors or events identified as problematic, it would issue an alert to station staff’s iPads or a computer. TfL staff received 19,000 alerts to potentially act on, and a further 25,000 were kept for analytics purposes, the documents say.
The categories the system tried to identify were: crowd movement, unauthorized access, safeguarding, mobility assistance, crime and antisocial behavior, person on the tracks, injured or unwell people, hazards such as litter or wet floors, unattended items, stranded customers, and fare evasion. Each has multiple subcategories.
Daniel Leufer, a senior policy analyst at digital rights group Access Now, says whenever he sees any system doing this kind of monitoring, the first thing he looks for is whether it is attempting to pick out aggression or crime. “Cameras will do this by identifying the body language and behavior,” he says. “What kind of a data set are you going to have to train something on that?”
The TfL report on the trial says it “wanted to include acts of aggression” but found it was “unable to successfully detect” them. It adds that there was a lack of training data—other reasons for not including acts of aggression were blacked out. Instead, the system issued an alert when someone raised their arms, described as a “common behaviour linked to acts of aggression” in the documents.
“The training data is always insufficient because these things are arguably too complex and nuanced to be captured properly in data sets with the necessary nuances,” Leufer says, noting it is positive that TfL acknowledged it did not have enough training data. “I’m extremely skeptical about whether machine-learning systems can be used to reliably detect aggression in a way that isn’t simply replicating existing societal biases about what type of behavior is acceptable in public spaces.” There were a total of 66 alerts for aggressive behavior, including testing data, according to the documents WIRED received.