Forensics tool ‘reanimates’ the ‘brains’ of AIs that fail in order to understand what went wrong

From drones delivering medical supplies to digital assistants performing everyday tasks, AI-powered systems are becoming increasingly embedded in everyday life. The creators of these innovations promise transformative benefits. For some people, mainstream applications such as ChatGPT and Claude can seem like magic. But these systems are not magical, nor are they foolproof – they can and do regularly fail to work as intended.

AI systems can malfunction due to technical design flaws or biased training data. They can also suffer from vulnerabilities in their code, which can be exploited by malicious hackers. Isolating the cause of an AI failure is imperative for fixing the system.

But AI systems are typically opaque, even to their creators. The challenge is how to investigate AI systems after they fail or fall victim to attack. There are techniques for inspecting AI systems, but they require access to the AI system’s internal data. This access is not guaranteed, especially to forensic investigators called in to determine the cause of a proprietary AI system failure, making investigation impossible.

We are computer scientists who study digital forensics. Our team at the Georgia Institute of Technology has built a system, AI Psychiatry, or AIP, that can recreate the scenario in which an AI failed in order to determine what went wrong. The system addresses the challenges of AI forensics by recovering and “reanimating” a suspect AI model so it can be systematically tested.

Uncertainty of AI

Imagine a self-driving car veers off the road for no easily discernible reason and then crashes. Logs and sensor data might suggest that a faulty camera caused the AI to misinterpret a road sign as a command to swerve. After a mission-critical failure such as an autonomous vehicle crash, investigators need to determine exactly what caused the error.

Was the crash triggered by a malicious attack on the AI? In this hypothetical case, the camera’s faultiness could be the result of a security vulnerability or bug in its software that was exploited by a hacker. If investigators find such a vulnerability, they have to determine whether that caused the crash. But making that determination is no small feat.

Although there are forensic methods for recovering some evidence from failures of drones, autonomous vehicles and other so-called cyber-physical systems, none can capture the clues required to fully investigate the AI in that system. Advanced AIs can even update their decision-making – and consequently the clues – continuously, making it impossible to investigate the most up-to-date models with existing methods.