Stop trusting unsourced summaries. Build a grounded video assistant that transcribes YouTube videos, indexes them with hybrid retrieval, and cites every answer with a timestamp you can click.
Transcribe YouTube videos, build a FAISS semantic index with BM25 hybrid retrieval, and answer questions with clickable timestamp citations using a Streamlit chat UI.
Build a grounded YouTube video Q&A assistant with hybrid retrieval and clickable timestamps.
What you'll ship
What you'll learn
Curriculum
Video QA primer
Understand why grounded video answers need timestamps and how the pipeline fits together end to end
Transcript and metadata
Parse YouTube URLs, pull timestamped transcripts, and fetch title and thumbnail for the UI
Chunking and indexing
Turn the transcript into time-windowed chunks, embed them locally, and build a FAISS index
Hybrid retrieval
Add BM25 keyword search, fuse it with semantic ranks via RRF, and route queries with an LLM classifier
Grounded chat UI
Wire the pipeline into a Streamlit app with an embedded video player and inline timestamp citations
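The transcript, chunking, and citation steps in the curriculum above can be sketched with the standard library alone. This is a minimal illustration, not the workshop's actual code: the helper names (`extract_video_id`, `chunk_by_window`, `format_timestamp`, `timestamp_link`), the 60-second window, and the assumption that each transcript segment is a dict with `text`, `start`, and `duration` keys are all hypothetical choices made for this example.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs


def extract_video_id(url: str) -> Optional[str]:
    """Pull the video ID out of common YouTube URL shapes (hypothetical helper)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.split("/")
        if len(parts) >= 3 and parts[1] in ("embed", "shorts"):
            return parts[2] or None
    return None


def chunk_by_window(segments, window_seconds=60.0):
    """Group consecutive segments into chunks spanning roughly window_seconds.

    Assumes each segment is a dict with "text", "start", and "duration" keys.
    Each chunk keeps its start and end time so an answer can cite it later.
    """
    chunks = []
    current = None
    for seg in segments:
        seg_end = seg["start"] + seg["duration"]
        if current is None or seg["start"] - current["start"] >= window_seconds:
            # Start a new chunk once the window has elapsed.
            current = {"start": seg["start"], "end": seg_end, "text": seg["text"]}
            chunks.append(current)
        else:
            current["text"] += " " + seg["text"]
            current["end"] = seg_end
    return chunks


def format_timestamp(seconds: float) -> str:
    """Render seconds as m:ss, or h:mm:ss for videos longer than an hour."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def timestamp_link(video_id: str, seconds: float) -> str:
    """Build a YouTube URL that starts playback at the given offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"
```

Each chunk's `start` value is what makes a citation clickable: the text of the chunk would be embedded and indexed, while `format_timestamp` and `timestamp_link` turn its start time into the label and link shown next to the answer.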
Who it's for
who have shipped a basic RAG pipeline and want to learn hybrid retrieval on a real content type
tired of hallucinated summaries and looking for a citation-first pattern that users can trust
who need to search inside long video archives without rewatching hours of footage
FAQ
No. The workshop uses sentence-transformers running locally, so embeddings are free and offline. You only need an LLM API key for the chat step, and OpenRouter has a free tier that works.
The workshop relies on YouTube captions fetched through youtube-transcript-api. Videos with captions disabled or missing will fail at the transcript step. You can extend the pipeline with Whisper if you need audio transcription.
FAISS is fast, local, and has zero infrastructure cost, which is perfect for per-video indexes that live as long as the Streamlit session. For multi-tenant production use, a hosted store like Qdrant is the natural next step.
Semantic search misses exact names, proper nouns, and rare keywords. BM25 catches those. Reciprocal rank fusion combines both signals so you get meaning and precision in a single ranked list.
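Reciprocal rank fusion is small enough to sketch in a few lines. The version below is an illustrative sketch rather than the workshop's implementation: it assumes each retriever returns a list of chunk IDs ordered best first, and it uses the commonly cited constant k = 60. The function name and the sample IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first rankings into one list.

    Each item scores sum(1 / (k + rank)) over the rankings it appears in,
    with rank starting at 1. Items ranked highly by more than one
    retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical chunk IDs from a semantic (FAISS) search and a BM25 search.
semantic_hits = ["c1", "c2", "c3"]
keyword_hits = ["c3", "c1", "c4"]
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)  # ['c1', 'c3', 'c2', 'c4']
```

Because the fusion uses only ranks and never raw scores, it sidesteps the problem that BM25 scores and embedding similarities live on different scales.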
Pricing
Subscribe to Pro for every paid course, or buy just this one.
Unlock this course and every paid course plus workshop replays. One subscription.
You save 54% with regional pricing
One-time purchase. Lifetime access to every lesson, exercise, and update.
You save 41% with regional pricing
Video QA with Transcript Search and Timestamp Citations
$29 one-time