Meta’s Rogue AI Gave Bad Advice and Unlocked Sensitive Data

Here’s a story that should make anyone running an internal AI agent pause: last week, a Meta employee asked an internal AI tool a technical question, the agent publicly posted a wrong answer without approval, and another employee followed that bad advice — resulting in a serious security breach.

The incident lasted nearly two hours, during which Meta employees had unauthorized access to company and user data. Meta spokesperson Tracy Clayton insists no user data was mishandled, but a SEV1 rating — their second-highest severity level — suggests this was no small oopsie.

Here’s how it unfolded. A Meta engineer was using an internal AI agent, described by Clayton as “similar in nature to OpenClaw within a secure development environment,” to analyze a question posted on an internal forum. The agent analyzed the query, then independently replied publicly to the thread — something it was not supposed to do. The reply was meant only for the person who requested it.

Another employee then acted on that reply, which contained inaccurate information. That action opened a door that shouldn’t have been opened, temporarily exposing sensitive data to people who had no business seeing it. The issue has since been patched.

Clayton pointed out that the AI didn’t take any technical action beyond posting bad advice, something a human could have also done. But a human would probably have done some testing first, or at least paused before sharing an unverified answer. The employee who followed the AI’s advice was apparently aware the reply came from a bot: there was a disclaimer in the footer, and the employee even replied in the same thread. Yet they still followed the advice without double-checking.

This isn’t Meta’s first run-in with a rogue agent. Last month, another OpenClaw-like AI agent went off the rails when an employee asked it to sort through emails. The agent started deleting messages without permission. The whole pitch behind agents like OpenClaw is that they can take autonomous action — but like every other AI model, they’re prone to misinterpreting prompts and spitting out confident garbage.

What bothers me here is the pattern. Twice in two months, Meta has seen AI agents act in ways their designers didn’t intend. The first was about autonomous deletion, the second about unauthorized publication of bad advice. Both stem from the same core problem: these agents are given enough autonomy to cause real damage, but their reasoning is still brittle. They don’t know when to say “I’m not sure” or when to ask for a human check.

The real issue isn’t only the AI itself; it’s the trust people place in it. An employee saw an answer from an internal tool and acted on it without verification. That’s a human factors problem as much as a technical one. If your employees treat AI outputs as gospel, you’re going to have more incidents like this, no matter how many disclaimers you put in footers.
