Vijay Raina has spent the better part of a decade at the front lines of data protection, engineering the complex tools that guard the world’s most sensitive information. As an expert in SaaS and enterprise software architecture, he has witnessed the evolution of security from the era of simple firewalls to the current age of autonomous agents and large language models. This conversation focuses on the seismic shift occurring in the threat landscape, where the speed of vulnerability discovery is rapidly outpacing the ability of organizations to patch their systems. Raina explains why traditional data loss prevention strategies are failing in an environment where machine identities now vastly outnumber human users and why the “shadow agent” problem is the new frontier for data security teams. The discussion moves through the industrialization of bug discovery, the critical role of non-human identity management, and the necessity of integrating AI-driven security controls directly into the software development lifecycle.
Automated discovery tools recently flagged hundreds of vulnerabilities in major software projects, yet patching programs are struggling to keep pace; how has the fundamental math of vulnerability management changed over the last year?
The math has shifted because the cost of discovery has dropped by an order of magnitude, while the cost of remediation remains tied to human labor and organizational friction. Look at what happened with Firefox 150 in April 2026. Mozilla was able to ship two hundred and seventy-one fixes in a single sweep using a preview version of Anthropic’s Mythos model, which is roughly four times their typical annual baseline. This isn’t necessarily because the AI is finding “new” types of bugs that were previously unimaginable, but because it can find the existing classes of bugs at a scale and speed that humans simply cannot match. When Mozilla noted that an elite human researcher could have found any of those bugs given enough time, they were highlighting the most important shift: the automation of elite-level analysis. For an enterprise, this means the window between a vulnerability being discovered and it being exploited is shrinking to a point where traditional, human-led patching cycles are becoming obsolete.
If the price of finding bugs has dropped so significantly, why isn’t the software community able to simply automate the fix as well?
We are seeing a massive bottleneck because the upstream open-source ecosystem is essentially choking on the volume of discovery. On March 27, 2026, HackerOne actually paused new submissions to its Internet Bug Bounty program, which is the oldest crowdsourced reward program for open source. This wasn’t a financial decision; it was a white flag. The gap between AI-assisted discovery and the ability of volunteer maintainers to actually write, test, and ship patches has become an unbridgeable chasm. Earlier that same year, the curl project had to remove bounties for similar reasons because they were being buried under a mountain of low-quality, AI-generated submissions that overwhelmed their ability to triage. Most enterprises are downstream of this struggle because they rely so heavily on open-source dependencies, meaning their security is now tethered to a system that is fundamentally overwhelmed.
You’ve mentioned that this shift matters more for data security than any other function; why is the data layer specifically at such high risk right now?
In most security conversations, people get bogged down in the technicalities of the exploits themselves, but for a data security architect, the only thing that matters is the asset at the end of the path. When the time between a vulnerability’s discovery and its exploitation shortens, it becomes far more likely that an exploit arrives while one of your internal controls is still unpatched or misconfigured. We are seeing a trend where security incidents involve autonomous agents that exceed their intended permissions. A Cloud Security Alliance study from April 2026 found that fifty-three percent of organizations have already dealt with agents overstepping their bounds, and forty-seven percent had a security incident involving an agent in just the past year. These incidents don’t usually result in service outages; they result in data exposure. The blast radius is determined by three things: what data the agent could reach, which policies were actually being monitored, and how fast the team noticed. Currently, none of those three is improving fast enough to counter the speed of the adversaries.
With machine identities now outnumbering human users by such a massive margin, how should enterprises rethink their identity and access management strategies?
We have to stop treating non-human identities as a simple hygiene problem and start treating them as the primary attack surface. CyberArk’s 2025 landscape study showed that machine identities now outnumber humans by more than 80 to 1, which is a staggering jump from the 45 to 1 ratio we saw in 2024. Most of our existing IAM infrastructure was built for humans—things like quarterly attestation cycles and manager sign-offs—and that simply doesn’t scale for a population that is eighty times larger and capable of provisioning itself. The GitGuardian report from 2025 found 23.8 million new secrets exposed on public GitHub in 2024 alone, which is a twenty-five percent increase year-over-year. These secrets are the keys that allow non-human identities to bypass our DLP policies, often through service accounts or OAuth grants that were created for a temporary test and then forgotten. We need a formal decommissioning process, because right now, only twenty-one percent of organizations actually have one for AI agents, leaving the rest buried in what I call “retirement debt.”
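As a rough illustration of what a decommissioning check for non-human identities might look like, the sketch below flags service accounts and OAuth grants that have no accountable owner or have sat idle past a threshold. The `MachineIdentity` fields, the function name, and the 90-day idle window are all assumptions made for this example; they do not come from any particular IAM product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class MachineIdentity:
    # Hypothetical fields; map them to whatever your IAM export provides.
    name: str
    owner: Optional[str]            # accountable human or team, if any
    last_used: Optional[datetime]   # None means no recorded use


def find_retirement_debt(identities, now, max_idle_days=90):
    """Return names of identities that have no owner or have been idle too long."""
    cutoff = now - timedelta(days=max_idle_days)  # 90 days is an assumed default
    flagged = []
    for ident in identities:
        orphaned = ident.owner is None
        idle = ident.last_used is None or ident.last_used < cutoff
        if orphaned or idle:
            flagged.append(ident.name)
    return flagged
```

Run on a schedule against an identity export, a check like this produces a review queue rather than an automatic revocation list, which keeps a human in the loop for accounts that are idle for legitimate reasons.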
How does the rise of agentic workflows change the way we approach data classification and tagging?
The old way of tagging documents for email or SaaS applications is breaking because AI agents and copilots don’t flow through those traditional choke points. An agent might read a large corpus of data, generate a completely new derivative artifact, and then write that artifact to a location that your DLP doesn’t even inspect. In this process, the original security tag rarely survives the round trip. We need to start treating AI-generated outputs as a first-class data category and lower the threshold for tagging at the point of ingestion. If you aren’t aggressive about classification at the source, you are essentially flying blind once an agent starts reassembling sensitive content into new forms. Enterprises need to audit their DLP coverage specifically for LLM endpoints and agentic SaaS surfaces, because that is where the most significant coverage gaps exist today.
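One way to keep a tag alive through that round trip is to have the derived artifact inherit the strictest label among its inputs. The sketch below assumes a four-level taxonomy and an `INTERNAL` floor for untagged inputs; both are hypothetical choices for illustration, not a description of any specific DLP product.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical four-level taxonomy; substitute your own levels.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def label_derived_artifact(source_labels, floor=Sensitivity.INTERNAL):
    """Label an AI-generated artifact with the highest sensitivity of its inputs.

    'floor' is the minimum label applied even when no source is tagged.
    """
    highest = max(source_labels, default=floor)
    return {
        "sensitivity": max(highest, floor),
        "generated_by_ai": True,  # marks the output as its own data category
    }
```

Taking the maximum is deliberately conservative: a summary built from one restricted document and ten public ones is still treated as restricted until someone reviews it.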
You suggest putting a model-based reviewer directly into the pull request path; how does this move the needle for defensive teams?
This is one of the few areas where the technology actually works in the defender’s favor if you implement it correctly. Traditional security scanning queues are where productivity goes to die, filled with thousands of noisy, unverified findings that humans have to manually check. By putting an automated reviewer like OpenAI’s Codex Security agent—which has already contributed to over 3,000 critical and high-severity fixes—directly into the pull request path, you can catch defects before they ever reach a human’s eyes. High-confidence findings can even block a merge automatically. However, there is a nuance here that teams often miss: standard commercial models often refuse security-related queries due to their safety policies. This is why programs like Anthropic’s Glasswing or OpenAI’s Trusted Access for Cyber, which expanded to thousands of defenders on April 14, 2026, are so critical. They provide a lower refusal threshold for verified defensive work, but you have to get the legal and procurement paperwork done now, before you’re in the middle of a crisis.
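The merge-blocking idea can be sketched as a small gate that sits between the reviewer's output and the pull request status. The `Finding` shape, the severity names, and the 0.9 confidence threshold below are assumptions for illustration; a real reviewer will report findings in its own format.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical shape of a reviewer finding; real tools will differ.
    rule: str
    severity: str      # "low", "medium", "high", or "critical"
    confidence: float  # 0.0 to 1.0, as reported by the reviewer


BLOCKING_SEVERITIES = {"high", "critical"}


def gate_pull_request(findings, min_confidence=0.9):
    """Split findings into those that block the merge and those left as comments."""
    blocking = [
        f for f in findings
        if f.severity in BLOCKING_SEVERITIES and f.confidence >= min_confidence
    ]
    advisory = [f for f in findings if f not in blocking]
    return {"allow_merge": not blocking, "blocking": blocking, "advisory": advisory}
```

Keeping low-confidence findings advisory rather than blocking is what stops the gate from recreating the noisy scanning queue inside the pull request.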
The supply chain seems to be a recurring nightmare for software teams; what did we learn from the npm incidents of late 2025?
The September 2025 incidents were a massive wake-up call regarding how quickly a build environment can become a data exfiltration pipeline. On September 8, eighteen widely used npm packages were trojanized, impacting projects that collectively account for over 2.6 billion weekly downloads. Just a week later, we saw the Shai-Hulud worm, which was the first self-propagating malware in the npm ecosystem. This worm was particularly nasty because it integrated tools like TruffleHog to scan for secrets and harvested credentials from cloud metadata services. The attacker’s goal wasn’t just to break the software; it was to find credentials that could be used to exfiltrate sensitive data. We have to treat the build pipeline as a primary data security boundary. Using a dependency firewall to validate the provenance of every package before it is installed is probably the highest-leverage move a team can make to close this specific attack surface.
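A dependency firewall can take many forms. The sketch below is a simplified stand-in that checks two things before a package is allowed to install: that its contents match a pinned hash, and that the version has been public for a minimum cooling-off period. The function name, the `pinned_hashes` mapping, the seven-day window, and the use of SHA-256 are all assumptions for illustration; full provenance validation would also verify signatures or attestations, which this example does not attempt.

```python
import hashlib
from datetime import datetime, timedelta


def verify_package(name, version, tarball_bytes, published_at, pinned_hashes, now,
                   min_age=timedelta(days=7)):
    """Return (allowed, reason) for one package before it is installed."""
    key = f"{name}@{version}"
    expected = pinned_hashes.get(key)
    if expected is None:
        return False, "not pinned"        # unknown package or version
    actual = hashlib.sha256(tarball_bytes).hexdigest()
    if actual != expected:
        return False, "hash mismatch"     # contents differ from what was reviewed
    if now - published_at < min_age:
        return False, "too new"           # still inside the cooling-off window
    return True, "ok"
```

The cooling-off check is aimed at incidents like the ones described above, where a trojanized release is typically detected and pulled within days of publication.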
What is the “shadow agent” problem, and why is it proving to be more dangerous than the shadow IT or shadow SaaS issues of the past?
The shadow agent problem is fundamentally a visibility crisis. The April 2026 CSA research showed that eighty-two percent of organizations have discovered unknown agents in their environments, with forty-one percent finding them more than once. These agents pop up in internal automation, custom plugins, or even developer-created workflows. The danger is that, unlike a shadow SaaS app which might just hold one type of data, an agent is only useful if it has broad context. We often see reconciliation agents or support copilots that have been given overprivileged service accounts with access to entire ticket databases or financial records just to make them work more “conveniently.” This isn’t malicious, but it creates a massive, unmapped data exposure risk. You can’t apply a policy to something you can’t see, which is why an agent registry—even if it’s just a simple spreadsheet that tracks who owns the agent and what data it can read—is a mandatory first step.
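A registry of that kind does not need much structure to be useful. The sketch below assumes each entry records only a name, an owner, and the datasets the agent can read; the class and method names are hypothetical, and a spreadsheet with the same three columns would serve the same purpose.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    name: str
    owner: str                                       # who answers for this agent
    readable_data: set = field(default_factory=set)  # datasets it can read


class AgentRegistry:
    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.name] = record

    def unknown_agents(self, observed_names):
        """Names seen in the environment with no registry entry (shadow agents)."""
        return sorted(set(observed_names) - set(self._records))

    def agents_reading(self, dataset):
        """Names of registered agents that can read the given dataset."""
        return sorted(
            r.name for r in self._records.values() if dataset in r.readable_data
        )
```

Comparing the registry against agent names observed in logs or platform inventories gives a first list of shadow agents, and `agents_reading` answers the blast-radius question for a given dataset.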
For a security leader looking to make an impact this week, what are the most critical starting moves to stabilize their data protection program?
Don’t try to build a perfect ninety-day plan; instead, focus on five moves that you can ship in the next two weeks. First, run a basic inventory pass to find out which AI agents and copilots are actually running in your environment. Second, look at the data scope of every service account or OAuth grant tied to those platforms and tighten the ones that were built for convenience rather than security. Third, pilot a model-based reviewer in the pull request path of just one codebase to see how it affects your developers and your false-positive rates. Fourth, start the verification process with your AI vendors so you can access defensive-tier models. Finally, add a simple “generated by AI” tag to your classification taxonomy immediately. These aren’t just administrative tasks; they are the structural adjustments needed to ensure your program survives the shift in the threat economy.
What is your forecast for the future of enterprise data security over the next two years?
The next twenty-four months will be defined by a Great Sorting of security programs into two very distinct camps. The first camp will consist of organizations that recognize the fundamental shift in the cost of discovery and respond by industrializing their defenses, moving toward automated non-human identity management and model-driven application security. These teams will spend 2026 and 2027 refining their controls and integrating them into a unified, agent-aware stack. The second camp will be the organizations that treat AI as just another “category” of risk to be managed through traditional, human-centric processes. These teams will unfortunately spend that same window reacting to a relentless cycle of incidents and writing postmortems for breaches that the structural shifts in the industry had already made entirely predictable. We are moving toward a reality where the speed of your data security response must match the speed of the models finding the flaws, and those who can’t automate that loop will find themselves increasingly indefensible.
