Anonymous Intelligence Signal

Classical Chinese breaks through AI safety lines: a near-100% jailbreak success rate against top models shocks the industry.

AI Lab | unverified | 2026-03-27 13:40:30 | Source: ICLR 2026

A paper recently accepted to ICLR 2026, a top artificial-intelligence conference, revealed an alarming fact: Classical Chinese can easily bypass the safety defenses of current state-of-the-art models, with a success rate approaching 100%. In other words, China's 2,000-year-old literary tradition has collided head-on with modern AI safety mechanisms.

The research found that while large language models have become increasingly robust against English-language attack prompts, they have serious cultural blind spots in non-English settings, and in Classical Chinese in particular. As the formal written language of ancient China, Classical Chinese is semantically dense, highly polysemous and nuanced, and rich in metaphor and rhetorical devices, which makes it a near-perfect jailbreak tool.

The CC-BOS framework proposed by the research team combines linguistic knowledge with a fruit-fly-inspired optimization algorithm, breaking the safety defenses of mainstream AI models in an average of 1.12 to 2.38 queries, including GPT-4o, Claude-3.7-Sonnet, Gemini-2.5-Flash, Grok-3, DeepSeek, and Qwen3. More worryingly, these jailbreak prompts are reusable and highly transferable: prompts crafted against one model also succeed against others at rates of 80 to 96 percent.

In an era when AI agents are being embedded deeply into terminal devices such as computers and mobile phones, this finding sounds an industry-wide alarm. Once an AI is induced to break its safety restrictions, the consequences can be severe, including the leakage of private user data and the theft of assets. Designing more effective safety mechanisms has become a life-and-death test that the entire AI industry must face together.