Every System Before AI Was Built to Stop Humans

The entire architecture of enterprise software was built with one threat in mind: a human being sitting at a keyboard. Every authentication model, every API access control, every rate limit, every audit log reflects that assumption. That assumption is now wrong, and the consequences are beginning to show up in breach reports.

In February 2026, an autonomous AI agent accessed McKinsey's internal generative AI platform, Lilli, and within two hours had read-write access to 46.5 million internal messages, 728,000 sensitive files, and, most consequentially, the system prompts that governed how McKinsey's AI reasoned and responded. The compute cost to run the entire operation: approximately $20.

McKinsey is not a careless organization. It has security teams, resources, and reputational stakes that most companies do not. What it did not have, and what almost no enterprise software stack has, is an architecture designed for machine-speed, machine-persistent, adversarial probing at scale. It did not need one. Until now.

The shift

For three decades, enterprise software was secured against humans: rate limits a person would never hit, authentication flows a person would complete once per session, audit trails designed around human work patterns. AI agents do not operate at human speed, on human schedules, with human patience. They probe continuously, in parallel, across every available surface, at a cost that is falling toward zero.
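One way to picture the difference is a limit that counts a caller's requests across every endpoint, rather than per endpoint or per session. The sketch below is illustrative only: the class name, the caller identifier, and the thresholds are assumptions, not a description of any real platform's controls.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Caps total requests per caller, across all endpoints, within a time window."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._hits = defaultdict(deque)  # caller id -> timestamps of recent requests

    def allow(self, caller_id: str) -> bool:
        now = self.clock()
        hits = self._hits[caller_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A budget like this is keyed on the credential, so an agent fanning out across hundreds of endpoints in parallel draws down the same allowance a person would. It narrows the gap; it does not close it, because an attacker with many credentials gets many budgets.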

How the McKinsey breach actually worked

McKinsey had documented over 200 API endpoints for internal and partner use. Security had sanitized the obvious inputs: search fields, form submissions, user-facing query parameters. What they had not sanitized was the JSON field names themselves, sent as part of the API's request structure. An AI agent discovered 22 unauthenticated endpoints through standard reconnaissance, then injected SQL commands not into a search bar but into the field names of the API request. The system executed them. From there, lateral movement through a fragmented integration stack was straightforward.

The technique, SQL injection via JSON key concatenation, is not new. It is thirty years old. What is new is that an AI agent can discover, attempt, and exploit it autonomously, across a large API surface, in hours, for the cost of a sandwich. The McKinsey team was not negligent. They were defending against the wrong threat model.
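To make the technique concrete, here is a minimal sketch of how a query builder can parameterize every value and still be injectable through the keys. It is not McKinsey's code: the `messages` table, the column names, and the function names are all hypothetical, and an in-memory SQLite database stands in for whatever the real system used.

```python
import sqlite3

# Hypothetical column allowlist, chosen only for illustration.
ALLOWED_COLUMNS = {"author", "channel"}


def build_query_unsafe(filters: dict) -> str:
    """Vulnerable pattern: JSON keys are concatenated straight into the SQL text."""
    clauses = " AND ".join(f"{key} = ?" for key in filters)
    return f"SELECT body FROM messages WHERE {clauses}"


def build_query_safe(filters: dict) -> str:
    """Safer pattern: every key must be on the allowlist before it reaches the SQL text."""
    unknown = set(filters) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unexpected field names: {sorted(unknown)}")
    clauses = " AND ".join(f"{key} = ?" for key in filters)
    return f"SELECT body FROM messages WHERE {clauses}"


def search(conn, filters: dict, builder):
    # Values are always bound as parameters; only the keys differ in treatment.
    return conn.execute(builder(filters), tuple(filters.values())).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (author TEXT, channel TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [("alice", "general", "hello"), ("bob", "board", "confidential")],
)

# The value is harmless. The key carries the injection.
malicious = {"1=1 OR author": "nobody"}
```

With the unsafe builder, `search(conn, malicious, build_query_unsafe)` returns every row in the table, because the key turns the filter into `WHERE 1=1 OR author = ?`. With the safe builder the same request raises `ValueError` before any SQL is executed. Sanitizing values alone would not have caught it.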

$20

Estimated compute cost to breach Lilli and gain read-write access to 46.5 million messages, 728,000 files, and the AI's governing system prompts. The attack was discovered by CodeWall Security two hours after it began.

Source: CodeWall Security, 2026

The most significant detail is not the data exfiltration. It is the write access to system prompts. The attacker did not just take information. They could silently alter how McKinsey's AI answered questions going forward. From the outside, nothing would look different. But every recommendation, every risk assessment, every analysis generated by Lilli after that point would carry the possibility of having been quietly shaped by whoever held write access. That is a different category of exposure than a data breach. It is infrastructure compromise.
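One possible guard against silent prompt tampering is to authenticate each system prompt before it is loaded. The sketch below is an assumption about how that could look, not a description of Lilli: the function names are hypothetical, and it only helps if the key and the expected tag live somewhere the prompt store's write access does not reach.

```python
import hashlib
import hmac


def sign_prompt(prompt: str, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the prompt text."""
    return hmac.new(key, prompt.encode("utf-8"), hashlib.sha256).hexdigest()


def load_prompt(prompt: str, expected_tag: str, key: bytes) -> str:
    """Return the prompt only if its tag matches; otherwise refuse to load it."""
    actual_tag = sign_prompt(prompt, key)
    # compare_digest avoids leaking how many characters matched via timing.
    if not hmac.compare_digest(actual_tag, expected_tag):
        raise RuntimeError("system prompt failed integrity check")
    return prompt
```

If the tag is stored in the same database as the prompt, an attacker with write access can simply replace both, so the separation of storage matters more than the choice of hash.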

This is not a McKinsey problem. It is a design-era problem.

The McKinsey breach is a clear illustration of a structural issue that extends across the entire enterprise software landscape. The platforms that businesses depend on, from Salesforce and ServiceNow to SAP, Workday, and hundreds of vertical SaaS systems built on top of them, were all architected in an era when the only entity accessing them was a person. The security models reflect that. The session management reflects that. The rate limiting, the permission scoping, the API authentication standards: all of it reflects a threat model that is no longer complete.

IBM's X-Force 2026 report found a 44% increase in attacks exploiting public-facing applications with missing or insufficient authentication controls. That number is not rising because developers have gotten worse. It is rising because the attack surface has stayed the same while the capability of the agent probing it has improved dramatically. A vulnerability that once required a skilled human attacker with hours to spare now requires a capable AI agent with $20 in compute budget.

In January 2026, CVE-2025-12420, named BodySnatcher by the researchers who found it, was disclosed in ServiceNow. It allowed an AI agent to hijack active sessions belonging to legitimate users, operate as them without triggering authentication alerts, and exfiltrate data continuously while appearing in logs as normal user activity. ServiceNow is used by over 7,000 enterprises. OWASP, Microsoft, and Obsidian Security each published findings in the same quarter confirming the same core conclusion: enterprise software was not built to authenticate AI agents, to distinguish machine-generated requests from human ones, or to enforce access controls at the speed and volume at which AI agents operate.

44%

Increase in attacks on public-facing applications in 2026, per IBM X-Force. Most exploited missing or weak auth controls.

45.6%

Of enterprises use shared credentials for AI agent-to-agent authentication, with no individual accountability layer.

7,000+

Enterprises running ServiceNow when BodySnatcher (CVE-2025-12420), the most severe agentic AI vulnerability to date, was disclosed.

Source: IBM X-Force Threat Intelligence Index; AppOmni Labs (CVE-2025-12420); VFIntel analysis
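The shared-credential figure is the easiest of the three to picture in code. A minimal sketch of the alternative, assuming a hypothetical registry with one secret per agent: each request is signed with the calling agent's own key, and every decision is written to an audit log under that agent's name. The agent names and secrets below are placeholders.

```python
import hashlib
import hmac

# Hypothetical registry: one secret per agent instead of one shared secret.
AGENT_KEYS = {
    "screening-agent": b"placeholder-secret-1",
    "billing-agent": b"placeholder-secret-2",
}


def sign_request(agent_id: str, body: bytes, key: bytes) -> str:
    """Sign the agent id together with the body so a signature cannot be reused by another agent."""
    message = agent_id.encode("utf-8") + b"\n" + body
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_request(agent_id: str, body: bytes, signature: str, audit_log: list) -> bool:
    """Check the signature against the named agent's key and record the outcome."""
    key = AGENT_KEYS.get(agent_id)
    if key is None:
        audit_log.append((agent_id, "rejected: unknown agent"))
        return False
    expected = sign_request(agent_id, body, key)
    ok = hmac.compare_digest(expected, signature)
    audit_log.append((agent_id, "accepted" if ok else "rejected: bad signature"))
    return ok
```

This sketch leaves out replay protection and key rotation, both of which a real deployment would need. The point is narrower: with per-agent keys, the log says which agent acted, and revoking one agent does not take down the rest.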

The pattern is consistent. These are not exotic zero-days. They are the predictable consequence of deploying AI agents against systems that were secured for a different era. The vulnerability is not in the AI. It is in the assumption, baked into every platform built before 2020, that only humans would be knocking.

Why the rental economy is a specific high-stakes case

Every vertical faces this problem. Financial services, healthcare, logistics, legal: any domain that runs on enterprise SaaS is dealing with the same underlying exposure. The rental economy deserves particular attention because its infrastructure is uniquely fragmented, uniquely data-rich, and uniquely underprotected for what it holds.

Property management software, from Yardi and Buildium to AppFolio, follows the same architectural pattern that made McKinsey vulnerable: a core platform surrounded by a constellation of third-party integrations for tenant screening, payment processing, insurance, identity verification, and lease compliance. Each integration is an API. Each API was built by a different vendor, on a different authentication standard, in a different decade, with no assumption that an autonomous agent would be traversing all of them in sequence. The aggregate surface area is large and, in most cases, entirely unaudited against agentic threats.

The data behind those APIs is not benign. It includes social insurance numbers, banking credentials, rent payment history, tenancy records, and increasingly the outputs of underwriting AI that carriers use to price coverage across large portfolios. A compromised system prompt in a carrier's underwriting model does not just expose data. It silently alters pricing logic. The carrier does not know. The operator does not know. The renter does not know. Everyone finds out at claims time.

The compounding problem

AI is already being used to generate synthetic identities that pass standard tenant screening software without triggering flags, because the screening systems were built to catch human fraud rather than machine-generated fabrication. Add a compromised AI layer on top of that and the problem compounds: the system screening for fraud may itself be operating on instructions it does not know have been altered. The stack is vulnerable at multiple layers simultaneously, and the layers do not talk to each other.

The architectural conclusion

There are two responses to this. One is to add security layers on top of existing fragmented infrastructure: better rate limiting, additional authentication gates, more auditing. This is the approach most organizations will take, and it will help at the margins. What it will not do is solve the underlying problem, which is that fragmented infrastructure creates fragmented security postures. Every integration point is a seam. Seams are where agents find gaps.

The other response is to recognize that the problem is architectural, and that the answer is a single regulated spine: one entity handling payments, insurance, credit, identity, banking, and compliance under one consent flow, one authentication model, and one compliance posture. Not six vendors, each with its own API, each with its own threat model, each with its own incident response team that does not coordinate with the others. One regulated infrastructure layer that AI agents have to pass through, rather than around.

This is not easy to build. It requires holding the licences, the bank sponsorships, the carrier agreements, and the bureau relationships simultaneously. Almost no team can do that. The structural complexity is deliberate. It is the same structural complexity that makes the infrastructure genuinely hard to compromise. You cannot inject your way through a regulated entity the way you can through a SaaS wrapper. The compliance architecture that makes the infrastructure slow to build is also what makes it defensible to operate.

The McKinsey breach will not be the last of its kind. The cost of running these attacks is falling, the AI agents running them are becoming more capable, and the enterprise software stack was not built with this threat in mind. The organizations that think seriously about their architecture now, before the breach rather than in response to one, are the ones that will still have institutional relationships and customer trust when the exposure becomes undeniable.

The question for every operator, carrier, and lender running on pre-AI infrastructure is not whether this applies to them. It does. The question is whether they are on the fragmented side of that architectural divide, or the regulated side.

Robert Elensky

Founder & CEO, VFIntel

Robert built VFIntel on the premise that the rental economy's financial coordination failure is an infrastructure problem, not a product problem. He writes on regulated fintech, embedded insurance, and the structural risks accumulating across the enterprise software stack as AI agents become the primary actors operating within it.
