Premium course

Hybrid search that finds what users actually asked for

Name: Hybrid document search with Qdrant and Sentence Transformers
Price: 24 USD
Availability: InStock

Pure vector search loses exact keywords. Pure keyword search loses meaning. You will build the hybrid retrieval stack that serious teams run in production: dense plus sparse, fused with RRF, reranked by a cross-encoder.

Engineers are learning here from

NVIDIA, Microsoft, Grab, Wise, Pipedrive, Bolt, Glia

Build a document Q&A system that combines dense semantic search with sparse keyword matching, fuses results with Reciprocal Rank Fusion, and boosts precision with a cross-encoder reranker, all running locally with Qdrant.

Hybrid retrieval with dense vectors, sparse keywords, RRF fusion, and cross-encoder reranking.

What you'll ship

Real projects, not toy demos.

Parse PDFs into clean markdown with PyMuPDF4LLM and chunk by headers
Generate dense embeddings with all-MiniLM-L6-v2 and sparse vectors with a BM25-style hasher
Store both vector types in a single Qdrant collection with hybrid indexing
Retrieve candidates with Reciprocal Rank Fusion across dense and sparse results
Rerank the top candidates with a cross-encoder for final precision
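
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It assumes each retriever returns an ordered list of chunk ids. The function name `rrf_fuse`, the sample ids, and the constant `k=60` are illustrative assumptions, not the course's code; in the course the fusion runs through Qdrant.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=None):
    """Fuse ranked lists of ids with Reciprocal Rank Fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is stable.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n] if top_n is not None else fused


# Hypothetical results from the dense and sparse retrievers.
dense_hits = ["chunk-3", "chunk-1", "chunk-7"]
sparse_hits = ["chunk-1", "chunk-9", "chunk-3"]

fused = rrf_fuse([dense_hits, sparse_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because RRF only looks at ranks, the dense and sparse scores never need to be on the same scale. A chunk that both retrievers rank well, like `chunk-1` here, rises above one that only a single retriever found.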

What you'll learn

You finish able to:

Parse messy PDFs into structured markdown with PyMuPDF4LLM
Split documents using header-aware and recursive splitters so chunks preserve context
Generate dense embeddings and BM25-style sparse vectors side by side
Configure a Qdrant collection with both dense and sparse vector support
Query Qdrant with Reciprocal Rank Fusion across two retrieval strategies
Rerank candidates with a cross-encoder to trade latency for precision
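
The latency-for-precision trade in the last item can be sketched as a rerank step with a pluggable scorer. In the course a cross-encoder would score each (query, chunk) pair; here a hypothetical `toy_overlap_score` stands in so the sketch runs without downloading a model. All names and sample texts below are assumptions for illustration.

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Score every (query, candidate) pair and keep the top_k best.

    score_fn is called once per candidate, so cost grows linearly with
    the number of candidates passed in.
    """
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query, text):
    """Placeholder scorer: fraction of query terms present in the text.

    A real cross-encoder would replace this function.
    """
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


candidates = [
    "General troubleshooting steps for the device",
    "Error code E42 means the filter needs a reset",
    "How to reset your password",
]
top = rerank("reset error code E42", candidates, toy_overlap_score, top_k=2)
```

Keeping the candidate list short is what makes this affordable: the first-stage retriever narrows the corpus cheaply, and the expensive scorer only sees the survivors.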

Curriculum

From raw PDFs to reranked hybrid results.

01 Document processing (3 lessons)
Parse PDFs with PyMuPDF4LLM and chunk them so retrieval has context to work with
02 Hybrid retrieval (3 lessons)
Generate dense plus sparse vectors, query Qdrant with RRF fusion, and rerank with a cross-encoder
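
As a rough illustration of chunking by headers, the sketch below splits markdown on `#` headings and attaches the heading path to each chunk. It is a standalone sketch under simplifying assumptions, not the splitter used in the course: `chunk_by_headers` is a hypothetical name, and the code does not handle `#` lines inside fenced code blocks.

```python
import re

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_headers(markdown_text):
    """Split markdown into chunks, each tagged with its heading path."""
    chunks = []
    path = []  # stack of (level, title) for the headings currently open
    body = []  # lines collected since the last heading

    def flush():
        text = "\n".join(body).strip()
        if text:
            chunks.append({"headers": [title for _, title in path], "text": text})
        body.clear()

    for line in markdown_text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            flush()  # close the previous chunk under the old heading path
            level = len(match.group(1))
            # Pop headings at the same or deeper level before descending.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks


sample = """# Install
Run the installer.

## Linux
Use the tarball.

# Usage
Call the CLI.
"""
chunks = chunk_by_headers(sample)
```

Carrying the heading path with each chunk means a fragment like "Use the tarball." still records that it belongs under Install and Linux, which is the context retrieval needs.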

Who it's for

Is this for you?

Backend engineers who built a RAG demo and saw it fail the moment users typed exact product names or codes

ML engineers moving from toy cosine similarity to the hybrid retrieval stack used in production

Search engineers who know BM25 inside out and want to add semantic signals without throwing lexical search away

FAQ

Common questions.

Do I need GPUs to run this?
No. The whole stack runs on CPU. all-MiniLM-L6-v2 and the cross-encoder are small, fast models designed for CPU inference.
Why not just use a managed vector DB?
Qdrant runs locally in Docker, stays free, and supports dense plus sparse in one collection. You learn the primitives instead of renting them.
Is this production quality?
The architecture is. The sparse embedder is a simplified BM25-style hasher for teaching. For production, swap in SPLADE or a corpus-tuned BM25. Everything else maps directly to production systems.
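
For a sense of what a simplified BM25-style hasher might look like, here is a minimal sketch. The course's own implementation is not shown on this page, so `hashed_sparse_vector`, the bucket count, and `k1=1.2` are all assumptions. The sketch applies BM25-style term-frequency saturation but has no IDF and no length normalisation, which is one reason a corpus-tuned BM25 or SPLADE would be the production choice.

```python
import hashlib
import re
from collections import Counter


def hashed_sparse_vector(text, num_buckets=2**18, k1=1.2):
    """Map text to (indices, values) for a sparse vector.

    Tokens are hashed into a fixed number of buckets, so no vocabulary
    has to be built. Weights use BM25-style saturation of term frequency.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    weights = {}
    for token, tf in Counter(tokens).items():
        # A stable hash; Python's built-in hash() is randomised per process.
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        index = int.from_bytes(digest, "big") % num_buckets
        # Saturating TF: repeated terms count for more, with diminishing returns.
        weight = tf * (k1 + 1) / (tf + k1)
        # Colliding tokens share a bucket, so their weights are summed.
        weights[index] = weights.get(index, 0.0) + weight
    indices = sorted(weights)
    values = [weights[i] for i in indices]
    return indices, values


indices, values = hashed_sparse_vector("E42 error error")
```

An exact token such as `E42` always lands in the same bucket, so a query containing it matches documents containing it, which is the keyword behaviour dense embeddings tend to lose.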

Pricing

Unlock this course with Pro.

One subscription unlocks every paid course and workshop replay. Pick yearly or monthly.

$16/mo (regular price $30/mo)

You save 47% with regional pricing

Billed annually. Cancel anytime.

This course plus every paid course
Workshop replays in your library
New releases the day they ship

Hybrid retrieval is the bar for serious search. Clear it.
