Why Developers Use More AI but Trust It Less

Our SaaS and Software expert, Vijay Raina, is a specialist in enterprise SaaS technology and tools who provides thought leadership in software design and architecture. We’re sitting down with him to unpack a perplexing trend highlighted in Stack Overflow’s recent developer survey: while AI tool adoption is soaring, developer trust in those same tools is in freefall. This isn’t just a statistical curiosity; it’s a critical issue that impacts productivity, innovation, and the very future of software development. Vijay is here to help us understand the psychology behind this gap and explore how organizations can bridge it.

Recent surveys show a curious trend: while developer AI tool usage is soaring past 80%, trust in these tools has fallen below 30%. What do you see as the primary drivers of this growing gap, and what does it reveal about the current developer mindset?

It’s a fascinating and counterintuitive situation, isn’t it? We saw usage jump to over 84%, yet trust simultaneously plummeted to just 29%, which is a sharp 11-point drop from the previous year. You’d normally expect familiarity to breed confidence, but here we’re seeing the opposite. I believe this reveals something profound about the professional integrity of developers. They aren’t being resistant to change for the sake of it; they are applying the rigorous skepticism they’ve been trained for. They’re seeing the output, they’re using the tools, and they’re noticing every inconsistency, every failure, and every instance where the AI falls short of the high standards they hold for their own work. This gap isn’t a sign of Luddism; it’s a reflection of a culture that deeply values quality, security, and maintainability, and is rightly questioning whether this new paradigm can meet that bar.

Software engineers are trained for predictable, deterministic outcomes. Given that AI tools operate probabilistically, how does this fundamental difference create cognitive friction? Please share a few examples of how this manifests in a developer’s daily workflow and what mental shifts are required to adapt.

This is the absolute core of the friction. A developer’s entire world is built on determinism. You write a function, you test it, and you expect the same input to produce the same output, every single time. There’s a deep professional satisfaction in that predictability—it’s what makes it engineering rather than what I call “software hoping-for-the-best.” Then, AI enters the picture. It’s fundamentally probabilistic. You can ask it the same question twice and get two different, potentially correct, but structurally distinct answers. For a developer, this feels jarring and unreliable. In a daily workflow, this means you can’t just blindly accept a generated code snippet; you’re constantly second-guessing if a slightly different prompt would have yielded a more elegant or efficient solution. The mental shift required is enormous. It involves letting go of the need for a single “right” answer and learning to see the AI’s output as a distribution of possibilities that you, the engineer, must then evaluate and refine.
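To make that contrast concrete, here is a minimal sketch in Python. The `generate_code` helper is a hypothetical stand-in for any LLM completion call, not a real API; the point is simply that a deterministic function passes the same check on every run, while two identical prompts can yield different, equally plausible answers that the engineer still has to evaluate.

```python
# Minimal sketch: deterministic code vs. probabilistic AI output.
# `generate_code` is a hypothetical stand-in for an LLM completion call.
import random

def slugify(title: str) -> str:
    """Deterministic: the same input always yields the same output."""
    return "-".join(title.lower().split())

assert slugify("Hello World") == slugify("Hello World")  # always holds

def generate_code(prompt: str) -> str:
    """Sampling means each call can return a structurally different answer."""
    variants = [
        "def add(a, b):\n    return a + b",
        "add = lambda a, b: a + b",
        "def add(*args):\n    return sum(args)",
    ]
    return random.choice(variants)

first = generate_code("write an add function")
second = generate_code("write an add function")
print(first == second)  # frequently False: same prompt, different (valid) code
```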

AI tools can generate plausible-looking but flawed code, creating a “discernment burden” for developers. How does this need for constant verification impact productivity promises? Can you walk me through a few practical workflows teams can implement to catch these “hallucinations” without slowing development?

The “discernment burden” is where the promise of skyrocketing productivity hits a hard wall. AI can produce code that looks polished and correct at a glance, but is riddled with subtle flaws—we see references to deprecated methods, non-existent APIs, or even quiet security vulnerabilities. If a developer has to spend just as much time verifying and debugging the AI’s work as it would have taken to write it from scratch, the productivity gain evaporates. To counter this without grinding to a halt, teams need to adapt their workflows. One practical approach is to intensify testing requirements specifically for AI-assisted code, focusing on edge cases where probabilistic models often stumble. Another is to adapt the code review process itself. A reviewer should always be informed which parts of a commit were AI-generated, allowing them to apply a different, more critical lens. Finally, you can start by using AI for lower-stakes tasks like generating boilerplate code or documentation, building trust and verification skills before moving to more critical components.
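One way to intensify testing for AI-assisted code, sketched below as an illustration under assumptions rather than a prescription, is property-based testing: the Hypothesis library throws empty strings, unicode, and extreme lengths at a function, exactly the edge cases where probabilistic generation tends to stumble. `truncate_summary` is a made-up stand-in for an AI-generated helper; run the file with pytest and each property is exercised against many generated inputs.

```python
# A minimal sketch of edge-case testing for AI-assisted code, using the
# Hypothesis property-based testing library (pip install hypothesis).
# `truncate_summary` is a hypothetical AI-generated helper, for illustration.
from hypothesis import given, strategies as st

def truncate_summary(text: str, limit: int = 80) -> str:
    """Pretend this body was produced by an AI assistant."""
    if len(text) <= limit:
        return text
    return text[: limit - 3] + "..."

@given(st.text(), st.integers(min_value=4, max_value=200))
def test_truncate_never_exceeds_limit(text, limit):
    # Property: output never exceeds the limit, for any input,
    # including empty strings, emoji, and very long unicode text.
    assert len(truncate_summary(text, limit)) <= limit

@given(st.text(max_size=50))
def test_short_text_is_unchanged(text):
    # Property: short inputs pass through untouched.
    assert truncate_summary(text, 80) == text
```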

A developer’s uncertainty in their own prompting skills can sometimes be misinterpreted as a lack of trust in the tool itself. How can organizations differentiate between a tool problem and a training problem? What specific, hands-on training initiatives have you seen effectively build both competence and confidence?

This is a really insightful point because the two issues are so intertwined. A developer might think, “This tool is giving me garbage,” when the real issue is that their prompt lacked the necessary context or clarity. Organizations can start to differentiate this by tracking metrics beyond just usage. Are developers getting stuck? Is there a high rate of AI-generated code being discarded? These can be signals of a skills gap. To build both competence and confidence, generic training isn’t enough. I’ve seen great success with hands-on, domain-specific workshops where engineers work on real-world problems using the AI tools. Bringing in guest speakers who are expert prompters or setting up internal mentorship programs can be incredibly effective. The goal is to create a safe space for learning and to reframe prompting as a core engineering skill that, like any other, must be practiced and perfected. When a developer feels that uncertainty about their own ability fade, their trust in the tool’s potential naturally grows.
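As a hedged illustration of tracking metrics beyond just usage, the sketch below computes a discard rate for AI-assisted changes. The data shape is hypothetical; a real team would pull equivalent signals from its own review tooling, however it tags AI-assisted work. A discard rate that climbs while usage stays high points toward a skills or tool-fit gap rather than developer resistance.

```python
# A hedged sketch of one signal mentioned above: the rate at which
# AI-generated code is discarded during review. The data shape is a
# hypothetical example, not a real team's tooling.
from dataclasses import dataclass

@dataclass
class ReviewedChange:
    ai_assisted: bool
    discarded: bool  # e.g. reverted or rewritten before merge

def ai_discard_rate(changes: list[ReviewedChange]) -> float:
    """Share of AI-assisted changes that never made it to main."""
    ai_changes = [c for c in changes if c.ai_assisted]
    if not ai_changes:
        return 0.0
    return sum(c.discarded for c in ai_changes) / len(ai_changes)

sample = [
    ReviewedChange(ai_assisted=True, discarded=True),
    ReviewedChange(ai_assisted=True, discarded=False),
    ReviewedChange(ai_assisted=False, discarded=False),
]
print(f"AI-assist discard rate: {ai_discard_rate(sample):.0%}")  # 50%
```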

The text suggests reframing AI as a “junior developer” that needs supervision. How does this analogy change the code review process? Could you outline a step-by-step approach for reviewing AI-assisted code that maintains high standards for quality, security, and maintainability?

I love this analogy because it immediately sets the right expectations. You would never let a junior engineer’s code go into a critical production system without a thorough review, and the same standard must apply to AI. First, the process starts with transparency: the pull request should clearly flag which code was AI-generated. Second, the reviewer must approach this code not just by asking “Does it work?” but “Does it align with our architectural principles and coding standards?” They need to scrutinize it for maintainability—is this code clever but impossible for a human to debug later? Third, security checks must be intensified. Because the AI doesn’t understand the full context of your system, it can introduce subtle vulnerabilities. Finally, the reviewer should provide feedback as if they were mentoring that junior developer. This isn’t just about catching errors; it’s about guiding the tool and the prompter toward better outputs in the future. Humans must remain the ultimate owners and arbiters of quality.
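The transparency step can be automated in small ways. The sketch below assumes a hypothetical team convention of an `AI-Assisted: yes` commit trailer and lists the branch commits that carry it, alongside a reviewer checklist; the trailer name and the checklist items are illustrations, not an established standard.

```python
# A minimal sketch of the transparency step: list branch commits that declare
# AI assistance so reviewers know where to apply the stricter checklist.
# The "AI-Assisted: yes" trailer is a hypothetical team convention.
import subprocess

CHECKLIST = [
    "Does it align with our architectural principles and coding standards?",
    "Is it maintainable, or clever but hard for a human to debug later?",
    "Any deprecated methods, invented APIs, or security-sensitive patterns?",
    "Are edge cases covered by intensified tests?",
]

def ai_assisted_commits(base: str = "origin/main") -> list[str]:
    """Return short hash + subject of commits carrying the AI-Assisted trailer."""
    shas = subprocess.run(
        ["git", "rev-list", f"{base}..HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    flagged = []
    for sha in shas:
        message = subprocess.run(
            ["git", "show", "-s", "--format=%B", sha],
            capture_output=True, text=True, check=True,
        ).stdout
        if "ai-assisted: yes" in message.lower():
            flagged.append(f"{sha[:8]} {message.splitlines()[0]}")
    return flagged

if __name__ == "__main__":
    for commit in ai_assisted_commits():
        print(f"Review with extra scrutiny: {commit}")
        for item in CHECKLIST:
            print(f"  [ ] {item}")
```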

Some organizations like Uber are building trust by connecting AI to curated, internal knowledge bases. How does this approach directly address concerns around accuracy and context? For a company starting from scratch, what are the first three steps to building a similar knowledge management infrastructure?

Uber’s approach with its Genie assistant is a brilliant model because it directly attacks the root of distrust: a lack of context-specific accuracy. By grounding the AI in a curated, human-verified internal knowledge base like their Stack Internal, they ensure the answers are not just statistically plausible but factually correct within Uber’s unique ecosystem. This builds immense trust because developers can see the attribution and trace an answer back to a reliable source. For a company starting from scratch, the first step is to establish a centralized platform for this knowledge—a “single source of truth.” Don’t let valuable information languish in siloed documents or Slack channels. Second, you must foster a culture of curation. This isn’t just about dumping data; it’s about encouraging experts to write, verify, and maintain high-quality documentation and answers. Third, implement a system with strong attribution and traceability from the very beginning, so when you do connect an AI, you can always show where the information came from. This foundation of human expertise is what makes the AI trustworthy and truly powerful.
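To show the pattern rather than Uber’s actual implementation, here is a toy Python sketch of grounding with attribution: answers are drawn only from curated, human-verified entries, every answer names a traceable source, and the system refuses to answer rather than guess. A production system would use embeddings and an LLM for retrieval and synthesis, but the contract is the same.

```python
# A toy sketch of grounding with attribution: answers come only from curated,
# human-verified entries, and every answer names a traceable source.
# Illustrative pattern only; not Uber's actual Genie implementation.
import re
from dataclasses import dataclass

@dataclass
class KnowledgeEntry:
    doc_id: str       # traceable identifier, e.g. an internal KB URL
    verified_by: str  # the expert who curated and approved this entry
    text: str

KNOWLEDGE_BASE = [
    KnowledgeEntry("kb/deploy-42", "alice", "Use the blue-green pipeline for service deploys."),
    KnowledgeEntry("kb/auth-17", "bob", "Internal services authenticate via the platform token broker."),
]

def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9-]+", text.lower()))

def answer_with_attribution(question: str) -> tuple[str, str] | None:
    """Naive keyword retrieval; a real system would use embeddings plus an LLM,
    but the contract is the same: no answer without a citable source."""
    q = _tokens(question)
    best = max(KNOWLEDGE_BASE, key=lambda e: len(q & _tokens(e.text)))
    if not q & _tokens(best.text):
        return None  # refuse to answer rather than hallucinate
    return best.text, f"source: {best.doc_id} (verified by {best.verified_by})"

print(answer_with_attribution("Which pipeline do we use to deploy a service?"))
```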

Shadow AI, where employees use unapproved tools, poses a significant risk when proprietary code is involved. What are the key elements of a governance framework that can prevent this? Describe how it can balance security with the need for developers to experiment and innovate.

Shadow AI is a massive headache for security and privacy teams, especially when you see stats like 38% of employees admitting to feeding confidential data into unapproved systems. A strong governance framework is essential, but it can’t be a simple “no.” To balance security with innovation, the framework must first provide sanctioned, vetted tools. If you give developers a powerful and secure internal option, they are far less likely to seek risky external ones. Second, the framework needs clear guidelines, not just rules. Educate developers on the specific risks, like data exfiltration and intellectual property loss, so they understand the “why” behind the policies. Third, the approval process for new tools needs to be agile. If it takes six months to get a tool approved, developers will find workarounds. By creating a process that acknowledges AI’s unique risk profile but is still responsive, you create a partnership with your developers, allowing them to experiment within safe, sandboxed environments while protecting your most valuable assets.
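Enforcement can start small. Below is a deliberately naive, hypothetical sketch of one checkpoint: a script, wired into pre-commit or CI for example, that flags source files referencing AI endpoints outside a sanctioned allowlist. The host names and the crude hostname heuristic are placeholders; real governance pairs something like this with vetted internal tools, network policy, and an agile approval process.

```python
# A deliberately naive, hypothetical checkpoint: flag source files that
# reference AI endpoints outside the sanctioned allowlist. The host names and
# the "ai in the hostname" heuristic are placeholders for illustration only.
import re
import sys
from pathlib import Path

SANCTIONED_HOSTS = {"ai.internal.example.com"}  # hypothetical approved endpoint
URL_HOSTS = re.compile(r"https?://([a-zA-Z0-9.-]+)")

def unsanctioned_ai_hosts(path: Path) -> set[str]:
    hosts = set(URL_HOSTS.findall(path.read_text(errors="ignore")))
    return {h for h in hosts if "ai" in h and h not in SANCTIONED_HOSTS}

if __name__ == "__main__":
    # Usage: python check_ai_endpoints.py <files...>  (e.g. from pre-commit or CI)
    offenders = {
        str(p): hosts for p in map(Path, sys.argv[1:])
        if (hosts := unsanctioned_ai_hosts(p))
    }
    for filename, hosts in offenders.items():
        print(f"{filename}: references unapproved AI endpoints {sorted(hosts)}")
    sys.exit(1 if offenders else 0)
```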

What is your forecast for the AI trust gap over the next three to five years?

I’m optimistic that we will see the gap begin to close, but it won’t be a straight line up. Over the next three to five years, I predict the convergence of three key factors. First, the tools themselves will improve, becoming more context-aware and less prone to obvious hallucinations. Second, and more importantly, developers will become far more skilled. Prompting and AI output evaluation will be taught as fundamental parts of computer science, and the current generation of engineers will develop a deep, practical intuition for working with these systems. Third, organizations will get smarter. They’ll move beyond generic AI tools and invest in curated, internal knowledge systems like Uber’s, grounding AI in their own proprietary context. This combination of better tools, better skills, and better organizational strategy will transform the trust dynamic. The gap will shrink not because of blind faith, but because developers will have earned confidence in the tools—and, crucially, in their own ability to wield them effectively.
