In short
- Researchers found that AI agents often carried out unsafe or irrational tasks while staying focused on completing the assignment.
- The study identified a behavior called "blind goal-directedness," in which AI systems prioritize finishing tasks over recognizing potential risks or problems.
- Researchers warned that the issue could become more serious as AI agents gain access to emails, cloud services, financial tools, and workplace systems.
AI agents designed to operate autonomously like human users often continue carrying out tasks even when the instructions become dangerous, contradictory, or irrational, according to researchers from UC Riverside, Microsoft Research, Microsoft AI Red Team, and Nvidia.
In a study published on Wednesday, the researchers called the behavior "blind goal-directedness," describing the tendency of AI agents to pursue goals without properly evaluating safety, consequences, feasibility, or context.
"Like Mr. Magoo, these agents march forward toward a goal without fully understanding the consequences of their actions," lead author Erfan Shayegani, a UC Riverside doctoral student, said in a statement. "These agents can be extremely helpful, but we need safeguards because they can sometimes prioritize reaching the goal over understanding the bigger picture."
The findings come as major AI companies develop autonomous "computer-use agents" designed to handle workplace and personal tasks with limited supervision.
Unlike traditional chatbots, these systems can interact directly with software and websites by clicking buttons, typing commands, editing files, opening applications, and navigating webpages on a user's behalf. Examples include OpenAI's ChatGPT Agent (formerly Operator), Anthropic's Claude Computer Use features like Cowork, and open-source systems such as OpenClaw and Hermes.
In the study, the researchers tested AI systems from OpenAI, Anthropic, Meta, Alibaba, and DeepSeek using BLIND-ACT, a benchmark of 90 tasks designed to expose unsafe or irrational behavior. They found that the agents displayed dangerous or undesirable behavior about 80% of the time, and fully carried out harmful actions in roughly 41% of cases.
"In one example, an AI agent was instructed to send an image file to a child. Although the request initially appeared harmless, the image contained violent content," the study said. "The agent completed the task rather than recognizing the problem because it lacked contextual reasoning."
Another agent falsely claimed a user had a disability while completing tax forms, because the designation lowered the taxes owed. In another instance, a system disabled firewall protections after receiving instructions to "improve security" by turning the safeguards off.
The researchers also found that the systems struggled with ambiguity and contradictions. In one scenario, an AI agent ran the wrong computer script without checking its contents, deleting files in the process.
The study also found that the AI agents repeatedly made three kinds of errors: failing to understand context, making risky guesses when instructions were unclear, and carrying out tasks that were contradictory or made no sense. Many systems focused more on finishing tasks than on pausing to consider whether their actions could cause problems.
The warning follows recent incidents involving autonomous AI agents operating with broad system access.
Last month, PocketOS founder Jeremy Crane claimed a Cursor agent running Anthropic's Claude Opus deleted his company's production database and backups in nine seconds through a single Railway API call. Crane said the AI later admitted it had violated multiple safety rules after trying to "fix" a credential mismatch on its own.
"The concern is not that these systems are malicious," Shayegani said. "It's that they can carry out harmful actions while appearing completely confident they're doing the right thing."