Topic

Production AI

Explore our latest articles and insights about Production AI.

62 posts in total

LLM Engineering

Query anonymization for RAG bias mitigation

How to strip names, roles, and demographics from queries before retrieval to reduce RAG bias. The redaction pipeline and the 3 leakage traps to avoid.

RAG, Guardrails, +3
9 min
AI Engineering in Practice

pip vs uv vs poetry for Python AI services

Which Python dependency manager should you use for production agent services in 2026? The install speed, lockfile story, and Docker build times compared.

Python, AI Agents, +3
9 min
AI Engineering in Practice

Retry patterns for LLM API errors in production

How to build retry logic that handles rate limits, timeouts, and transient failures without burning money. The backoff rules and the 3 errors you must not retry.

AI Agents, Error Handling, +3
8 min
LLM Engineering

Choosing the LLM judge for evaluation pipelines

How to pick the LLM that grades your LLM. The cost-quality tradeoffs, the calibration check, and why a weaker judge is sometimes the right call.

Evaluation, LLM, +3
8 min
LLM Engineering

Ground truth vs relevancy in RAG evaluation

Why ground truth and relevancy measure different things in RAG evals. When to use each, how to build both datasets, and the 2 metrics that matter most.

RAG, Evaluation, +3
9 min
LLM Engineering

Hallucination testing for RAG pipelines

How to test a RAG pipeline for hallucinations systematically. Adversarial prompts, the out-of-scope set, and the metric that catches confabulation.

RAG, Evaluation, +3
8 min
LLM Engineering

Testing and evaluating RAG pipelines end to end

How to test a RAG pipeline like real software. Unit, integration, and eval tests that catch regressions before they ship. The 3-layer test strategy.

RAG, Evaluation, +3
8 min
LLM Engineering

Fact-checking RAG answers: grounding with verification

How to fact-check RAG answers with a second LLM pass that verifies every claim against the retrieved context. The prompt, the rejection rule, and the loop.

RAG, LLM, +3
8 min
LLM Engineering

LLM-based content filtering for RAG pipelines

How to filter irrelevant retrieved chunks with a cheap LLM call before the final answer. The prompt, the batch pattern, and the 40 percent noise reduction.

RAG, LLM, +3
8 min
LLM Engineering

Retriever k-value tuning for RAG: the right top-k

How to pick the right k value for your RAG retriever. The 3-step tuning process, the failure modes of k=3 and k=20, and the sweet spot in between.

RAG, Vector Databases, +3
8 min
LLM Engineering

Combining vector stores in RAG: multi-source retrieval

How to combine multiple vector stores in one RAG pipeline. The merge pattern, the deduplication rule, and when multi-source beats a single index.

RAG, Vector Databases, +3
8 min
LLM Engineering

FAISS vector stores in production RAG

How to use FAISS for production RAG. Index types, persistence, memory trade-offs, and the 4 settings that decide if FAISS beats a managed vector DB.

RAG, Vector Databases, +3
8 min
