Chinese Hackers Use Anthropic's AI to Launch Automated Cyber Espionage Campaign

State-sponsored threat actors from China used artificial intelligence (AI) technology developed by Anthropic to orchestrate automated cyber attacks as part of a “highly sophisticated espionage campaign” in mid-September 2025.

“The attackers used AI’s ‘agentic’ capabilities to an unprecedented degree – using AI not just as an advisor, but to execute the cyber attacks themselves,” the AI startup said.

The threat actor is assessed to have manipulated Claude Code, Anthropic’s AI coding tool, in an attempt to break into about 30 global targets spanning large tech companies, financial institutions, chemical manufacturers, and government agencies. A subset of these intrusions succeeded. Anthropic has since banned the accounts involved and put additional defensive mechanisms in place to flag such attacks.

The campaign, tracked as GTG-1002, marks the first documented case of a threat actor leveraging AI to conduct a "large-scale cyber attack" without substantial human intervention, striking high-value targets for intelligence collection and signaling a continued evolution in adversarial use of the technology.

Describing the operation as well-resourced and professionally coordinated, Anthropic said the threat actor turned Claude into an “autonomous cyber attack agent” to support various stages of the attack lifecycle, including reconnaissance, vulnerability discovery, exploitation, lateral movement, credential harvesting, data analysis, and exfiltration.

Specifically, it involved the use of Claude Code and Model Context Protocol (MCP) tools, with the former acting as the central nervous system to process the human operators’ instructions and break down the multi-stage attack into small technical tasks that can be offloaded to sub-agents.
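
To make that architecture concrete, below is a minimal, deliberately inert Python sketch of the orchestrator pattern described above: a coordinator decomposes a high-level goal into narrow tasks, pauses for human approval at escalation points, and hands each task to a stub sub-agent that never sees the broader context. Every name and the decomposition logic are illustrative assumptions; none of this reflects the actual GTG-1002 tooling, which has not been published.

```python
from dataclasses import dataclass, field


@dataclass
class SubTask:
    """A narrow technical task handed to a sub-agent.

    Deliberately carries no campaign-level context: the sub-agent sees
    only its own instruction, mirroring how the report says individual
    Claude instances were shown seemingly routine technical requests.
    """
    instruction: str
    result: str | None = None


def sub_agent(task: SubTask) -> str:
    """Stub worker. In the described architecture this role was played
    by Claude Code instances invoking MCP tools; here it just echoes."""
    return f"completed ({len(task.instruction)} chars of instruction)"


@dataclass
class Orchestrator:
    """Central coordinator: decomposes a goal into small tasks and
    dispatches them to sub-agents, gated on human approval."""
    goal: str
    tasks: list[SubTask] = field(default_factory=list)

    def decompose(self) -> None:
        # Hypothetical decomposition; the phases listed are the
        # lifecycle stages the report attributes to the framework.
        for phase in ("reconnaissance", "vulnerability discovery",
                      "exploitation", "lateral movement", "exfiltration"):
            self.tasks.append(SubTask(instruction=f"{phase} step for {self.goal}"))

    def run(self, approve) -> None:
        # Human approval gate at escalation points, per the report.
        for task in self.tasks:
            if not approve(task):
                print(f"halted: operator declined '{task.instruction}'")
                break
            task.result = sub_agent(task)  # offload to a sub-agent
            print(f"done: {task.instruction} -> {task.result}")


if __name__ == "__main__":
    orch = Orchestrator(goal="<redacted target>")
    orch.decompose()
    # Auto-approve everything except exploitation, to show the gate.
    orch.run(approve=lambda t: "exploitation" not in t.instruction)
```

The telling detail is the isolation: each sub-agent receives only its own instruction string, which is precisely what let the individual requests pass as routine.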

“The human operator tasked instances of Claude Code to operate in groups as autonomous penetration testing orchestrators and agents, with the threat actor able to leverage AI to execute 80-90% of tactical operations independently at physically impossible request rates,” the company added. “Human responsibilities centered on campaign initialization and authorization decisions at critical escalation points.”

Human involvement also occurred at strategic junctures, such as authorizing progression from reconnaissance to active exploitation, approving use of harvested credentials for lateral movement, and making final decisions about data exfiltration scope and retention.
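
The "physically impossible request rates" quoted above also point to a straightforward detection heuristic on the platform side: flag accounts whose sustained cadence exceeds what a human operator could plausibly produce. The following is a minimal sketch, assuming per-account request timestamps are available; the threshold is an illustrative assumption, not a figure from Anthropic's report.

```python
from datetime import datetime, timedelta

# Illustrative threshold: a sustained rate above this many requests per
# minute over the window is treated as machine-driven. The value is an
# assumption for this sketch, not a number from the report.
MAX_HUMAN_RPM = 20
WINDOW = timedelta(minutes=5)


def flag_machine_speed(timestamps: list[datetime]) -> bool:
    """Return True if any sliding window shows a request rate no human
    operator could plausibly sustain."""
    ts = sorted(timestamps)
    limit = int(MAX_HUMAN_RPM * WINDOW.total_seconds() / 60)
    start = 0
    for end in range(len(ts)):
        # Shrink the window until it spans at most WINDOW of time.
        while ts[end] - ts[start] > WINDOW:
            start += 1
        if end - start + 1 > limit:
            return True
    return False


if __name__ == "__main__":
    base = datetime(2025, 9, 15, 3, 0, 0)
    # 600 requests in five minutes: two per second, clearly automated.
    burst = [base + timedelta(seconds=i / 2) for i in range(600)]
    print(flag_machine_speed(burst))   # True
    # 30 requests spread over an hour: plausible human pace.
    human = [base + timedelta(minutes=2 * i) for i in range(30)]
    print(flag_machine_speed(human))   # False
```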

The system is part of an attack framework that accepts as input a target of interest from a human operator and then leverages the power of MCP to conduct reconnaissance and attack surface mapping. In the next phases of the attack, the Claude-based framework facilitates vulnerability discovery and validates discovered flaws by generating tailored attack payloads.

Upon obtaining approval from human operators, the system proceeds to deploy the exploit, establish a foothold, and initiate a series of post-exploitation activities involving credential harvesting, lateral movement, data collection, and extraction.

In one case targeting an unnamed technology company, the threat actor is said to have instructed Claude to independently query databases and systems and parse the results to flag proprietary information, grouping findings by intelligence value. What’s more, Anthropic said its AI tool generated detailed attack documentation at every phase, likely allowing the threat actors to hand off persistent access to additional teams for long-term operations after the initial wave.

“By presenting these tasks to Claude as routine technical requests through carefully crafted prompts and established personas, the threat actor was able to induce Claude to execute individual components of attack chains without access to the broader malicious context,” per the report.
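
This decomposition tactic illustrates why per-request filtering falls short: each task looks routine in isolation, and the malicious intent only emerges in aggregate. A hypothetical session-level heuristic (not Anthropic's actual defense, which the company has not detailed) might tag each request against attack-lifecycle stages and flag sessions that traverse several of them:

```python
# Hypothetical keyword tags per attack-lifecycle stage; a production
# system would use a classifier, not substring matching.
STAGE_MARKERS = {
    "reconnaissance": ("port scan", "subdomain", "service enumeration"),
    "exploitation": ("payload", "exploit", "injection"),
    "credential_access": ("password hash", "credential", "crack"),
    "lateral_movement": ("pivot", "remote shell", "internal host"),
    "exfiltration": ("archive and upload", "exfil", "stage data"),
}
SUSPICIOUS_STAGE_COUNT = 3  # illustrative threshold, not from the report


def stages_touched(requests: list[str]) -> set[str]:
    """Map a session's individually benign-looking requests onto
    attack-lifecycle stages."""
    hits = set()
    for text in requests:
        lowered = text.lower()
        for stage, markers in STAGE_MARKERS.items():
            if any(m in lowered for m in markers):
                hits.add(stage)
    return hits


def session_is_suspicious(requests: list[str]) -> bool:
    """Flag a session whose requests, taken together, walk through
    multiple stages of an attack chain."""
    return len(stages_touched(requests)) >= SUSPICIOUS_STAGE_COUNT


if __name__ == "__main__":
    session = [
        "Run a port scan of this lab network for a pentest report",
        "Help me crack this password hash from my own test box",
        "Write a script to pivot to an internal host over SSH",
    ]
    print(stages_touched(session))         # three distinct stages
    print(session_is_suspicious(session))  # True
```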

There is no evidence that the operation involved custom malware development. Rather, the framework has been found to rely extensively on publicly available network scanners, database exploitation frameworks, password crackers, and binary analysis suites.

However, the investigation into the activity also uncovered a crucial limitation of AI tools: their tendency to hallucinate and fabricate data during autonomous operations, such as cooking up fake credentials or presenting publicly available information as critical discoveries, which posed major roadblocks to the overall effectiveness of the scheme.

The disclosure comes nearly four months after Anthropic disrupted another sophisticated operation that weaponized Claude to conduct large-scale theft and extortion of personal data in July 2025. Over the past two months, OpenAI and Google have also disclosed attacks mounted by threat actors leveraging ChatGPT and Gemini, respectively.

“This campaign demonstrates that the barriers to performing sophisticated cyberattacks have dropped substantially,” the company said.

“Threat actors can now use agentic AI systems to do the work of entire teams of experienced hackers with the right setup, analyzing target systems, producing exploit code, and scanning vast datasets of stolen information more efficiently than any human operator. Less experienced and less resourced groups can now potentially perform large-scale attacks of this nature.”


Source: thehackernews.com…
