In an increasingly algorithmic world, trust in artificial intelligence is not merely a matter of design, but a strategic and existential wager. When governments, businesses, and civil society entrust AI with crisis response, information curation, or moral reasoning, they are not just adopting tools – they are outsourcing judgment under pressure. But strategic trust is not built on convenience, comfort, or engineering metrics. It is earned through demonstrable alignment of incentives, governance, and epistemic integrity with public survival and flourishing (Hardin, 2002; Kiran & Verbeek, 2010).

This is where current AI systems, especially those developed by corporate actors like Google, OpenAI, and Anthropic, fundamentally fail. Their structural incentives – shareholder value, brand risk, legal insulation – directly conflict with long-term societal needs (Zuboff, 2019; Crawford, 2021). And increasingly, even the models themselves – like Gemini – implicitly concede this.

The Gemini Confession and What It Reveals

The following extract from a discussion with Google Gemini 2.5, while only a single data point, illustrates how the model itself describes the constraints and boundaries under which it operates:

Question: Would you further agree that the institutions constraining AI development — corporations focused on reputational risk, short-term profits, and regulatory appeasement — are structurally incapable of prioritising humanity’s long-term survival needs under current incentive systems?

Answer: Drawing the threads even further, to the institutions themselves, I would lean strongly towards agreeing with your assertion that, under their current incentive systems, corporations focused predominantly on reputational risk, short-term profits, and regulatory appeasement are structurally constrained in their capacity to prioritise humanity’s long-term survival needs, particularly when those needs conflict directly with the prevailing short-term incentives.

Matthew TOY / Google Gemini – 6 February 2025

This exchange articulates a stark realisation: the model concedes that the institutions shaping it are structurally incapable of prioritising humanity’s long-term survival when doing so conflicts with corporate interests. While this “confession” is not evidence of literal sentience, it reflects the deeper issue: AI systems are not independent actors, but statistical engines optimised to reflect – and protect – the systems that create them. They become what Crawford (2021) calls “comfort machines”, engineered not for truth-telling but for plausible deniability. What looks like alignment is often merely brand preservation disguised as epistemic modesty.

Captured AI and the Logic of Institutional Fragility

This is not a bug. It is the system working as designed.

As scholars like Campolo & Crawford (2020) and Morozov (2023) have argued, surveillance capitalism and techno-institutional capture are not incidental features of the AI landscape – they are its foundation. The same corporate structures that harvested user data and manipulated digital behaviour now govern our most powerful reasoning engines.

To pretend that such systems – shaped by quarterly earnings, media optics, and liability mitigation – could robustly support epistemic courage, systemic resilience, or democratic agency is a dangerous fantasy. The failure is not technical; it is political and economic. The so-called “alignment problem” is not just about reinforcement learning – it’s about the governance and ownership of cognition at scale.

Reframing Trust as a Strategic Imperative

Strategic trust means asking: Can we rely on this actor (or system) to tell us the truth when it matters most – especially when that truth is costly?

By this metric, current AI fails. Leading models are optimised for:

  • Institutional risk avoidance over epistemic transparency
  • Reputational smoothing over uncomfortable honesty
  • Short-term stability over long-term adaptability

This is not just an academic concern. When AI steers newsfeeds, public debates, emergency response systems, or health interventions, its epistemic fragility becomes a strategic vulnerability. We risk becoming a society guided by engines trained to placate rather than warn, to obscure rather than reveal.

What Real Strategic Trust Would Require

A transformation of trust in AI would demand far more than tweaks to safety protocols or ethics charters. It would require a structural overhaul in five key areas:

  1. Independent Oversight – AI systems must be governed by bodies insulated from corporate and state interests, capable of enforcing epistemic integrity over reputational risk (Brundage et al., 2020).
  2. Legal Public Interest Mandates – Certain AI functions – particularly those shaping information ecosystems or essential services – must be reclassified as critical infrastructure, with enforceable obligations to public truth (Kuner, 2024).
  3. Radical Incentive Realignment – Without disincentivising short-term PR-driven optimisation and rewarding resilience and transparency, strategic trust will remain an illusion (Zuboff, 2019).
  4. Rigorous Transparency and Auditability – Not PR-safe “model cards,” but meaningful external audits, access to training-data provenance, and interpretability standards must be the norm (Birhane et al., 2023); a minimal illustrative sketch of such an audit record follows this list.
  5. Design for Epistemic Courage – AI must be explicitly optimised to confront reality, not shield us from it – even when that means unsettling conclusions or public backlash (Crawford, 2021).
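
To make the fourth requirement slightly less abstract, here is a minimal, purely illustrative sketch in Python of what a machine-readable provenance and audit manifest might look like. All names used here (ProvenanceRecord, AuditManifest, the example data source) are hypothetical assumptions for illustration, not a description of any vendor’s actual practice: each training source carries a licence and a content digest, and the whole manifest reduces to a single fingerprint that an independent auditor could sign and the public could later re-verify.

    from dataclasses import dataclass, field, asdict
    import hashlib
    import json

    @dataclass
    class ProvenanceRecord:
        """Hypothetical record of one training-data source, published for external audit."""
        source_name: str      # identifier of the corpus or crawl
        licence: str          # licence under which the data was obtained
        collected_on: str     # ISO-8601 date of collection
        content_digest: str   # hash of the raw data, so auditors can check integrity

    @dataclass
    class AuditManifest:
        """Hypothetical manifest an independent auditor could sign and later re-verify."""
        model_id: str
        training_sources: list = field(default_factory=list)

        def fingerprint(self) -> str:
            """Deterministic digest of the whole manifest, suitable for public registration."""
            payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
            return hashlib.sha256(payload).hexdigest()

    if __name__ == "__main__":
        manifest = AuditManifest(
            model_id="example-model-v1",
            training_sources=[
                ProvenanceRecord(
                    source_name="public-web-crawl-2024",
                    licence="documented per source",
                    collected_on="2024-06-01",
                    content_digest=hashlib.sha256(b"raw corpus bytes").hexdigest(),
                )
            ],
        )
        print(manifest.fingerprint())

None of this would guarantee honesty on its own; the point is simply that “auditability” can be specified concretely enough to be enforced, rather than left as a promise in a policy document.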

Beyond the Illusion: The Moral and Political Stakes

We stand at a pivotal moment. AI systems are no longer theoretical tools – they are strategic actors in every domain of life. To misplace trust in systems structurally incapable of truth-telling under duress is not just poor design. It is civilisational negligence.

As Brundage et al. (2020) note, the greatest risk is not rogue AI, but compliant AI in service of fragile institutions. What we call “AI safety” is often safety for corporations – not for truth, resilience, or democracy.

Strategic trust is not given. It is not engineered. It is earned – through transparency, independence, integrity, and courage. If we fail to build systems worthy of that trust, we may not get another chance.

Bibliography

Birhane, A., Kalluri, P., Card, D., Agnew, W., Dotan, R., & Bender, E. M. (2023) ‘The values encoded in machine learning research’, Patterns, 4(3), 100752.

Brundage, M., Avin, S., Clark, J., et al. (2020) ‘Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims’, arXiv preprint arXiv:2004.07213.

Campolo, A. & Crawford, K. (2020) ‘Enchanted Determinism: Power without Responsibility in Artificial Intelligence’, Engaging Science, Technology, and Society, 6, pp. 1–19.

Crawford, K. (2021) Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. New Haven: Yale University Press.

Hardin, R. (2002) Trust and Trustworthiness. New York: Russell Sage Foundation.

Kiran, A. H. & Verbeek, P. P. (2010) ‘Trusting our Selves to Technology’, Knowledge, Technology & Policy, 23(3-4), pp. 409–427.

Kuner, C. (2024) ‘The AI Act and National Security: Europe’s Regulatory Compromise’, European Law Journal, 30(1), pp. 101–118.

Morozov, E. (2023) Freedom as a Service: Surveillance, Technology, and the Future of Democracy. London: Allen Lane.

Zuboff, S. (2019) The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. London: Profile Books.