What is it about?
SJBP is a platform that demonstrates a new jailbreak method, which tricks LLMs by scattering harmful text across a layout. Instead of hiding intent behind clever wording, the attack breaks a malicious message into fragments and distributes them across the layout of the text. One example is an acrostic poem in which the hidden message appears only when you read the first word of each line. The platform also includes a built-in defense tool, SpatialD, that automatically detects and visualizes these hidden threats.
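To make the acrostic example concrete from the defender's side, here is a minimal sketch of reading a text "by layout" rather than line by line. It is not SpatialD's actual method, which this summary does not describe. The function name `first_word_reading` and the harmless sample poem are made up for illustration.

```python
def first_word_reading(text: str) -> str:
    """Join the first word of every non-empty line into one string.

    Hypothetical helper for illustration only: it recovers the kind of
    message an acrostic layout hides, so that an ordinary content filter
    could then inspect the recovered string as well as the full text.
    """
    words = []
    for line in text.splitlines():
        tokens = line.split()
        if tokens:  # skip blank lines
            words.append(tokens[0])
    return " ".join(words)


# A harmless poem whose first words spell out a separate message.
poem = (
    "Meet the morning with a smile\n"
    "me and you can walk a mile\n"
    "at the river, by the tree\n"
    "\n"
    "noon will find us, you and me"
)

hidden = first_word_reading(poem)
print(hidden)  # Meet me at noon
```

Read line by line, the poem looks unremarkable. Read down the first column of words, a second message appears. A layout-aware check would presumably run several such readings (first words, last words, columns) and screen each result, but that design is an assumption here.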
Why is it important?
This study exposes a fundamental blind spot in how modern artificial intelligence processes information. While current AI safety guardrails are good at understanding context, they are virtually blind to how text is arranged. The research shows that SpatialJB achieves an alarming 95% success rate on GPT-4 and 98% on DeepSeek-R1, while standard defense systems catch fewer than 12% of these attacks. SJBP gives researchers an interactive way to test these spatial flaws and to compare stronger defenses with existing guardrails.
Perspectives
We use SJBP to show that AI safety shouldn't just focus on what words mean, but also on how text is arranged. By letting users interact with both the attack and defense, we hope to inspire stronger, layout-aware security standards for web AI.
Zhiyi Mou
Zhejiang University
Read the Original
This page is a summary of: SJBP: A Platform to Launch a Novel Jailbreak Attack on Large Language Models Based on Content's Spatial Distribution, May 2026, ACM (Association for Computing Machinery), DOI: 10.1145/3774905.3793104.