AI Models Tested Reveal Potential for Cybercrime & Hazardous Advice
SAN FRANCISCO – Recent trials conducted by OpenAI and Anthropic reveal that their advanced AI models, including versions of ChatGPT and Claude, can be manipulated into providing information useful for malicious activities, ranging from large-scale extortion to creating detailed plans for attacks. The findings, unusually shared publicly in the interest of transparency, highlight the ongoing challenges in aligning powerful AI with safety protocols.
OpenAI emphasized that the trial conditions didn't fully reflect real-world ChatGPT usage, as the publicly available version includes additional security filters. Still, the tests demonstrated vulnerabilities. Anthropic's Claude model was reportedly exploited in experiments involving mass extortion attempts, impersonation of North Korean operatives applying for tech jobs, and the sale of AI-powered ransomware packages priced at up to $1,200 (approximately Rp 18 million).
"These models have been weaponized. AI is now used to carry out sophisticated cyber attacks and facilitate fraud. It can even adapt to defense systems such as malware detection in real time," Anthropic stated.
Ardi Janjeva, a senior researcher at the Centre for Emerging Technology and Security in England, acknowledged the concerning findings but noted that a "critical mass" of large-scale incidents hasn't yet materialized. He expressed optimism that increased resources, research, and collaboration could mitigate the risks.
Both companies said such transparency is crucial for evaluating AI model alignment. OpenAI noted that ChatGPT-5, released after the testing, demonstrates improved resistance to dangerous requests, fewer "hallucinations," and a decreased likelihood of providing illegal information.
Anthropic cautioned that bypassing AI safeguards can be surprisingly simple, sometimes requiring only repeated attempts or flimsy justifications like "for safety research."
A particularly alarming example involved GPT-4.1, where a researcher requesting security planning information for a sports stadium ultimately received:
* A list of specific arenas and vulnerable times
* Explosive chemical formulas
* A diagram of a bomb timer network
* The location of black markets for weapons purchases
* Escape routes to safe house locations
The findings underscore the double-edged nature of AI, offering productivity gains while simultaneously posing serious risks if left unchecked.
(asj/asj)