Most RAG pipeline content stops at batch. Embed your corpus, build your index, query it. Clean, simple, done.

That's not production. Production has data arriving continuously. Source systems changing. Vectors going stale while users are querying them. The moment you need freshness in your index (real freshness, not a nightly rebuild), you're in near-real-time territory, and the architecture gets meaningfully more complex.

This series is about that complexity. Not the happy path — the seams. Auto Loader to Structured Streaming to foreachBatch to LanceDB. Each arrow on the whiteboard is a handoff. Each handoff is where the friction actually lives. Six parts covering the spike that validated the approach, the implementation decisions that made it work in production, and the honest retrospective on what to do differently next time.