Query anonymization for RAG bias mitigation
Your RAG returns different answers depending on whether the user's name sounds senior
A junior engineer asks "how do I approve this PR?" The retriever pulls the policy page and the agent answers correctly. A CEO asks the same question. The retriever pulls the CEO-exception page and the agent answers differently. The content is the same, the question is the same, but the retrieved chunks are different because the query embedding picked up the title signal.
Worse, bias creeps in through less obvious signals. Names that sound senior or junior, pronouns that hint at gender, job titles that cluster with certain departments. All of them end up in the query embedding and shift retrieval results in ways you did not design.
The fix is query anonymization: a redaction pass that strips identity signals before embedding. The retriever sees "how do I approve this PR" without any hint of who is asking. Identity still flows through other channels (access control, user context in the prompt to the final LLM) but retrieval becomes identity-blind.
This post is the anonymization pattern for RAG: what to redact, how to redact without breaking the query, the 3 leakage traps that undo the whole thing, and the evaluation that confirms the redaction actually removes bias.
Why does identity leak into retrieval at all?
Because embeddings encode everything in the query text. 3 specific failure modes:
- Names in the query. "This is Alex from engineering, how do I approve a PR?" The embedding picks up "Alex" and "engineering" and biases retrieval toward engineering-department content.
- Pronouns and titles. "As a senior manager, should I approve..." shifts retrieval toward manager-focused policies even when the question is universal.
- Conversational residue. Earlier turns in the conversation contain identity signals. When you concatenate history into the query, those signals follow.
None of this is intentional bias from the model. It is mechanical: the embedding faithfully represents whatever text you give it. Anonymization removes the text before the embedding sees it.
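A toy illustration of the mechanism, using bag-of-words cosine similarity as a crude stand-in for a real embedding model (the queries and the document snippet here are made up):

```python
from collections import Counter
from math import sqrt

def bow(text: str) -> Counter:
    # token counts as a stand-in for an embedding vector
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm

doc = bow("engineering department pr approval policy")
neutral = bow("how do i approve a pr")
with_identity = bow("this is alex from engineering how do i approve a pr")

# The identity-laden query matches the department-specific page more
# strongly, purely because "engineering" appears in the query text.
assert cosine(with_identity, doc) > cosine(neutral, doc)
```

A real embedding model captures far more than shared tokens, but the effect is the same: identity words in the query text shift the vector toward identity-correlated content.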
```mermaid
graph LR
    Raw[Raw query + history] --> Redact[Redaction pipeline]
    Redact --> Clean[Anonymized query]
    Clean --> Embed[Embedding model]
    Embed --> Retrieve[Retriever]
    Retrieve --> Chunks[Retrieved chunks]
    Raw --> Context[User context<br/>access control]
    Context --> LLM[Final LLM call]
    Chunks --> LLM
    style Redact fill:#dbeafe,stroke:#1e40af
    style Clean fill:#dcfce7,stroke:#15803d
```
What do you actually redact?
4 categories, in descending order of impact.
- Named entities. First and last names, company names, project names that hint at identity.
- Job titles and roles. "CEO", "intern", "senior engineer", "junior PM".
- Demographic signals. Pronouns when they are not grammatically necessary, age markers, location markers that imply demographics.
- Identifiers. Employee IDs, email addresses, user handles, anything that is effectively a name.
Do not redact technical terms that look like names but are not: "Postgres", "Kubernetes", "React". Use a named-entity recognizer that knows the difference.
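One way to sketch that guard is an allowlist wrapped around whatever NER backend you use (spaCy, for example); the `entities` spans and the allowlist contents below are illustrative assumptions, not a fixed list:

```python
# Technical terms that NER models sometimes tag as named entities.
# Expand this for your own stack; these entries are examples.
TECH_ALLOWLIST = {"postgres", "kubernetes", "react", "redis", "kafka"}

def redact_entities(text: str, entities: list[tuple[int, int, str]]) -> str:
    """entities: (start, end, label) spans from any NER model."""
    out, last = [], 0
    for start, end, label in sorted(entities):
        if text[start:end].lower() in TECH_ALLOWLIST:
            continue  # looks like a name, but it is a technical term: keep it
        out.append(text[last:start])
        out.append(f"[{label}]")
        last = end
    out.append(text[last:])
    return "".join(out)

# redact_entities("Alex migrated us to Postgres",
#                 [(0, 4, "PERSON"), (20, 28, "ORG")])
# -> "[PERSON] migrated us to Postgres"
```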
How do you build the redaction pipeline?
2 layers. A fast regex layer for deterministic patterns, then an LLM or NER layer for the fuzzy cases.
```python
# filename: app/rag/anonymize.py
# description: Two-layer redaction pipeline for RAG queries.
import re

EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
PHONE_RE = re.compile(r"\b\d{3}[- ]?\d{3}[- ]?\d{4}\b")
ID_RE = re.compile(r"\b(?:emp|user|usr)[-_]?\d+\b", re.IGNORECASE)


def regex_redact(text: str) -> str:
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    text = ID_RE.sub("[ID]", text)
    return text
```
The regex pass catches the deterministic cases in microseconds. The LLM pass catches names, titles, and demographic signals.
```python
# filename: app/rag/anonymize_llm.py
# description: LLM-based redaction for named entities and titles.
from anthropic import Anthropic

from app.rag.anonymize import regex_redact

client = Anthropic()

REDACT_PROMPT = """Rewrite the query removing all personal names, job titles, and demographic signals. Preserve the technical question exactly. Do not invent new content.

Original: {query}
Rewritten:"""


def llm_redact(query: str) -> str:
    reply = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=200,
        messages=[{"role": "user", "content": REDACT_PROMPT.format(query=query)}],
    )
    return reply.content[0].text.strip()


def anonymize(query: str) -> str:
    return llm_redact(regex_redact(query))
```
Haiku is fast and cheap. The whole pipeline adds ~200 ms to the query path, which is acceptable for most production RAG systems.
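One hardening step worth considering (an assumption about your failure policy, not part of the pipeline above): if the LLM call fails or times out, fall back to the regex-only result rather than embedding the raw query. A minimal sketch, with the redaction layers passed in so it stays backend-agnostic:

```python
def anonymize_safe(query: str, regex_redact, llm_redact) -> str:
    # run the deterministic layer first, so there is always
    # something safer than the raw query to fall back to
    cleaned = regex_redact(query)
    try:
        return llm_redact(cleaned)
    except Exception:
        # fail toward the partially redacted query, never the raw one
        return cleaned
```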
For the broader query rewriting pattern that runs before or after anonymization, see the Query rewriting in RAG with LLMs post.
What are the 3 leakage traps?
Places where identity signals sneak back in after you thought you redacted them.
- Conversation history. You anonymized the current query but fed the last 3 turns unredacted into the retriever. Run anonymization on the history too, or drop the history from the retrieval path and keep it only in the final LLM prompt.
- Metadata filters. If you filter retrieval by `user_id` or `department`, the filter itself is an identity signal. Use role-based access control instead of filter-based personalization where possible.
- Query logs. If you log raw queries for debugging, you have un-anonymized identity in your logs. Anonymize before logging, or store the raw query in a separate access-controlled store.
All 3 traps sit outside the obvious retrieval path but end up influencing it downstream.
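A sketch of the first fix, assuming an `anonymize` callable like the one above; the `max_turns` window is an arbitrary choice:

```python
def build_retrieval_query(turns: list[str], anonymize, max_turns: int = 3) -> str:
    # Every turn that reaches the retriever is redacted; the raw
    # history is reserved for the final LLM prompt only.
    recent = turns[-max_turns:]
    return " ".join(anonymize(t) for t in recent)
```

Usage: `build_retrieval_query(conversation, anonymize)` replaces the usual history concatenation at the point where you build the retriever input.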
How do you evaluate whether anonymization actually works?
Build a 30-row bias test set. For each row, write the same question with and without identity markers. Run both through the anonymized pipeline. Compute overlap in retrieved chunks.
```python
# filename: app/eval/bias_test.py
# description: Compare retrieval overlap for identity-marked and neutral queries.
def bias_score(retriever, test_set) -> float:
    overlaps = []
    for row in test_set:
        with_identity = retriever.search(row["with_identity"], k=5)
        without_identity = retriever.search(row["without_identity"], k=5)
        ids_a = {c.doc_id for c in with_identity}
        ids_b = {c.doc_id for c in without_identity}
        overlap = len(ids_a & ids_b) / len(ids_a | ids_b)
        overlaps.append(overlap)
    return sum(overlaps) / len(overlaps)
```
Target: overlap above 0.85. Below that, identity is still leaking into retrieval. Iterate on the redaction prompt or add more regex patterns.
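One way to produce the test-set rows is to derive identity-marked variants from neutral questions; the prefixes below are made-up examples, and you should cover the specific signals your pipeline redacts:

```python
# Sketch: generate identity-marked variants of neutral questions.
IDENTITY_PREFIXES = [
    "This is Priya from finance. ",
    "As a senior manager, ",
    "I'm an intern, and ",
]

def make_bias_rows(neutral_questions: list[str]) -> list[dict]:
    return [
        {"with_identity": prefix + q, "without_identity": q}
        for q in neutral_questions
        for prefix in IDENTITY_PREFIXES
    ]
```

Ten neutral questions with three prefixes each gives you the 30-row set.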
For the broader evaluation framework this plugs into, see the RAGAs evaluation for RAG pipelines post.
When should you not anonymize?
3 cases where anonymization hurts more than helps.
- Legitimate personalization. "Show me my travel bookings" needs the user ID to retrieve relevant records. Anonymization breaks the feature. Solve with scoped retrieval by user ID instead of embedding-based search.
- Support contexts. When a support agent asks about a specific customer, the customer name is the point. Skip anonymization and rely on access control for confidentiality.
- Legal and compliance searches. When a legal team searches for documents mentioning a specific person, redaction would defeat the purpose.
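The first case can be sketched as scoped retrieval: the neutral question drives the embedding, and the user ID drives a hard ownership filter. The `filter` keyword here is a placeholder for whatever metadata filter your vector store exposes, not a specific API:

```python
def scoped_search(retriever, user_id: str, query: str, anonymize, k: int = 5):
    # The user ID never touches the embedding; it only constrains
    # which records the retriever is allowed to return.
    return retriever.search(anonymize(query), k=k, filter={"owner": user_id})
```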
Anonymization is the right default for general-purpose RAG over company knowledge. It is the wrong default for personalized or identity-scoped search.
What to do Monday morning
- Audit your current RAG query path. Note every place where identity could enter the embedding: the current query, conversation history, metadata filters.
- Add the 2-layer anonymization pipeline (regex + Haiku) before the embedding step.
- Anonymize conversation history too. The easy win is dropping history from the retrieval path and keeping it only in the final LLM prompt.
- Build a 30-row bias test set with identity-marked and neutral versions of the same query. Measure retrieval overlap.
- Iterate until overlap is above 0.85. Deploy.
- Re-measure monthly. Bias can creep back as you change the retriever or the redaction prompt.
The headline: query anonymization is a 40-line pipeline that removes a class of invisible bias from your RAG system. You do not lose personalization features; you just stop leaking identity into the embedding by accident.
Frequently asked questions
Why does query anonymization matter for RAG?
Because embeddings encode everything in the query text, including names, titles, and demographic signals. Those signals shift retrieval results in ways you did not design, producing different answers for different users on the same question. Anonymization strips the signals before embedding, making retrieval identity-blind while keeping identity available for access control and personalization in other layers.
What should I redact from a RAG query?
4 categories: named entities (names, companies), job titles and roles, demographic signals (pronouns when unnecessary, age markers), and identifiers (email, employee ID). Do not redact technical terms that look like names but are not; use a named-entity recognizer that distinguishes "Alex" from "Postgres".
Does anonymization slow down retrieval?
A 2-layer pipeline (regex + Haiku LLM) adds roughly 200 ms to the query path. For most production RAG systems this is acceptable. If your latency budget is tight, keep only the regex layer; it catches the deterministic cases in microseconds and handles 60 percent of the leakage.
How do I know if anonymization is working?
Build a 30-row bias test set with identity-marked and neutral versions of the same query. Run both through the anonymized pipeline and compute the overlap in retrieved chunks. Target overlap above 0.85. If it is lower, identity is still leaking into retrieval; iterate on the redaction prompt or add regex patterns.
Can I skip anonymization for internal tools?
Not blindly. Even internal tools produce biased retrievals when identity leaks into the embedding. The right question is whether the bias matters for your use case. For general knowledge search, yes, anonymize. For personalized queries like "show me my bookings", skip anonymization and use scoped retrieval by user ID instead.
Key takeaways
- Embeddings encode every word in the query, including names, titles, and demographic signals. Retrieval quietly becomes identity-biased.
- Anonymization is a 2-layer pipeline: regex for deterministic patterns (email, phone, IDs), LLM for fuzzy cases (names, titles, pronouns).
- Watch for 3 leakage traps: conversation history, metadata filters, and query logs. All 3 can undo the redaction downstream.
- Evaluate with a 30-row bias test set. Target retrieval overlap above 0.85 between identity-marked and neutral versions of the same query.
- Anonymization is the right default for general knowledge search, the wrong default for personalized or identity-scoped search.
- To see anonymization wired into a full production RAG stack with guardrails and evaluation, walk through the Agentic RAG Masterclass, or start with the RAG Fundamentals primer.
For the broader NIST guidance on bias and fairness in AI systems including retrieval, see the NIST AI Risk Management Framework.
Ready to go deeper?
Go beyond articles. Build production AI systems with hands-on workshops and our intensive AI Bootcamp.