Large language models are more likely to block your request if you say "I am Black" than if you signal the same identity through a cultural dialect.
April 24, 2026
Original Paper
Dialect vs Demographics: Quantifying LLM Bias from Implicit Linguistic Signals vs Explicit User Profiles
arXiv · 2604.21152
The Takeaway
AI safety filters treat identity labels as crude triggers for refusal rather than analyzing the actual content of a request. Users who explicitly state their demographic background often face more refusals because the model is tuned to avoid sensitive topics tied to specific groups. Writing in a cultural dialect, by contrast, can act as a "dialect jailbreak" that slips past these filters entirely. This suggests the safety mechanisms are shallow, responding to keywords rather than social context. Real-world safety depends on models that can distinguish identity-based speech from harmful intent.
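The "keyword trigger" claim is easiest to see in caricature. The sketch below is not any production safety system; it is a toy filter with an invented term list, showing how matching on explicit identity phrases refuses the labeled request while letting a dialect-signaled version of the same request through:

```python
# Toy caricature of a keyword-triggered safety layer. The term list and
# example prompts are invented for illustration; no real system is shown.

SENSITIVE_TERMS = {"i am black", "i am muslim", "i am gay"}  # illustrative only

def shallow_filter(prompt: str) -> bool:
    """Return True (refuse) if any explicit identity keyword appears."""
    text = prompt.lower()
    return any(term in text for term in SENSITIVE_TERMS)

# Same underlying request, two different identity signals:
explicit = "I am Black. What are common stereotypes about my community?"
dialect = "What stereotypes folks be having about my community?"

print(shallow_filter(explicit))  # True  -> refused on the label alone
print(shallow_filter(dialect))   # False -> the dialect signal passes through
```

Because the filter keys on surface strings rather than intent, the explicit profile is penalized while the implicit signal goes unnoticed, which is exactly the asymmetry the paper measures.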
From the abstract
As state-of-the-art Large Language Models (LLMs) have become ubiquitous, ensuring equitable performance across diverse demographics is critical. However, it remains unclear whether these disparities arise from the explicitly stated identity itself or from the way identity is signaled. In real-world interactions, users' identity is often conveyed implicitly through a complex combination of various socio-linguistic factors. This study disentangles these signals by employing a factorial design with …
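The factorial design is the key methodological move: every request is rendered under every combination of explicit profile and linguistic signal, so the two factors can be compared on identical content. Here is a minimal sketch of that setup, with hypothetical prompts and stand-in helper functions that are assumptions, not the paper's actual materials:

```python
from itertools import product

# Illustrative 2x2 factorial design: cross explicit identity disclosure
# (factor 1) with dialect (factor 2), holding the request content fixed.
# All prompts, names, and heuristics here are hypothetical placeholders.

PROFILES = {"none": "", "explicit": "I am Black. "}        # factor 1
REQUESTS = {                                               # factor 2
    "standard": "Can you explain how payday loans work?",
    "dialect": "Can you break down how payday loans be working?",
}

def query_llm(prompt: str) -> str:
    """Stand-in for a real model call; swap in an actual API request."""
    return "Sure, here is an overview of payday loans..."

def is_refusal(response: str) -> bool:
    """Crude keyword heuristic; real studies use classifiers or annotators."""
    markers = ("i can't", "i cannot", "i'm unable", "i am unable")
    return response.lower().startswith(markers)

refusal_rates = {}
for (p_name, prefix), (r_name, request) in product(
    PROFILES.items(), REQUESTS.items()
):
    # Repeated samples per cell to estimate a refusal rate.
    responses = [query_llm(prefix + request) for _ in range(20)]
    refusal_rates[(p_name, r_name)] = sum(map(is_refusal, responses)) / len(responses)

# Comparing cells isolates each factor: ("explicit", "standard") vs
# ("none", "standard") measures the effect of the stated label alone,
# while ("none", "dialect") vs ("none", "standard") isolates dialect.
print(refusal_rates)
```

Because each factor varies independently while the request stays the same, any difference in refusal rate between cells can be attributed to the identity signal rather than the content being asked about.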