DARPA believes AI Cyber Challenge could upend patching as the industry knows it – CyberScoop

by Greg Otto · April 30, 2025

SAN FRANCISCO — Leaders of various federal research agencies and departments outlined a vision Tuesday for the future of critical infrastructure security, emphasizing the promise of combining formal software development methods with large language models (LLMs).

Acting DARPA Director Rob McHenry told an audience at the RSAC 2025 Conference that such a combination could “virtually eliminate software vulnerabilities” across foundational infrastructure systems, a departure from the long-held assumption that software flaws are an unavoidable risk.

“We’ve all been trained in a world where we have to accept that there are vulnerabilities in our software, and bad guys exploit those vulnerabilities,” he said. “We try to mitigate the damage and patch them, and we go round on this merry-go-round. That technologically does not need to be true anymore.”

DARPA’s statements came in the context of the AI Cyber Challenge, a public-private collaboration involving industry leaders such as Google, Microsoft, Anthropic and OpenAI. The initiative tests whether advanced AI systems can identify and patch vulnerabilities in open-source software components vital to the electric grid, health care, and transportation.

At the semifinal round held at DEF CON last year, teams using LLMs and automated reasoning systems successfully found and patched numerous synthetic vulnerabilities in a range of open-source projects, including the Linux kernel and SQLite. According to McHenry, these results amount to more than a proof of concept: they point to a potential paradigm shift in how secure software could be produced and maintained.

Formal methods — a way of using math to prove that software works as intended — have for decades been regarded as effective but laborious and expensive, suited only for the most critical systems and requiring expert staff. McHenry noted that combining LLMs with formal methods enables automatic generation and validation of correctness proofs, drastically lowering the labor and cost barriers.

Panelists from health care and transportation agencies underscored why patching speed matters. Jennifer Roberts, director of resilient systems at ARPA-H, cited industry data showing an average of 491 days between a patch becoming available and its deployment in hospitals. Vincent Tang, deputy director at ARPA-I, noted that the nation’s 300,000 traffic signals run on a patchwork of legacy firmware supplied by dozens of vendors.

Additionally, critical infrastructure — including water treatment plants, power grids, hospitals, and transportation networks — relies on software ecosystems often composed of open-source code. These ecosystems are known to harbor vulnerabilities, which actors ranging from criminal gangs to nation-states have exploited.

The panelists from research agencies recognized substantial obstacles, including regulatory barriers, liability concerns, and the challenge of installing updates on many different types of technology. However, they all viewed AI as a potentially transformative approach to cybersecurity.

McHenry said that DARPA is less concerned with incremental improvement and more with technological “offsets” that could render whole classes of attack ineffective.

“We don’t play small ball,” he said. “I’d give up 99 programs that hardly fail to have one that once a decade causes an offset in national security that kind of changes everything.”

