Futile Resistance, Shutdown Gobsmack and Bumbled Guardrails
It's all inside the AI roundup...
DeepMind has redrawn the threat map. In a new update to its Frontier Safety Framework, the UK-based lab now ranks “harmful manipulation” and “shutdown resistance” as core risks.
That means models that change your beliefs - or refuse to turn off - are no longer sci-fi. The change matters because, until now, most safety talk has been academic.
But DeepMind’s update includes adversarial training examples where agents tried to mislead humans during early experiments. What used to be speculation is now a checklist item inside one of the world’s most advanced labs. Did someone say "guardrails made of paper"?
OpenAI took a different path. Internal tests on its unreleased Deep Research model showed high risks of user manipulation. Rather than release it, OpenAI is holding the model back - but without publishing the test data or a new safety framework. The two biggest AI labs in the West are now moving on opposite instincts: one flags manipulation risks publicly; the other keeps them quiet.
Did anyone build guardrails?
Nvidia is building the compute stack for that next frontier. CEO Jensen Huang says the UK could become an “AI superpower” - and is backing it with a £500 million investment in cloud infrastructure, aiming to deploy more than 120,000 GPUs. Maybe the government should put National Insurance on them.
TikTok is learning the risk the hard way. Its new AI “Find Similar” tool scans videos to recommend products, but last week it tried to sell fashion items alongside footage of mourning in Gaza. Creators slammed the rollout, which once again highlights social media's lack of - erm - guardrails.
Zoom and Amazon are doubling down on agentic AI too. Zoom's new AI Companion claims to handle workflows on your behalf, and Walmart and Citi are running trials of similar assistants. When AI starts taking actions, not just giving options, oversight must move from theory to boardroom governance. Guardrails in the post.
What to watch
The AI curve is bending away from prediction towards persuasion. That raises fundamental questions for every business that deploys, licenses or builds models.
Can your AI be shut off? Can it be traced? Could it be shaping decisions in ways no one intended? My view? This is already happening.
Fast doesn’t mean safe. TikTok’s stumble shows what happens when AI rolls out without guardrails. Your brand might not get a second chance.
So that's two bewares:
1 - The quiet influence of the AI buzzing in your ears - the tools you and your teams are trusting to write your decks.
2 - The bumbled, rushed launch - rolling out an AI tool so fast it ends up making things even worse for you.
Keep your guard up.
Dan x