
Keep Your RAG Knowledge Base Fresh: LlamaIndex Document Management for Real Apps
Chris Harper
Jul 1, 2026 · 20:03 UTC
TL;DR: Ingestion is only the first job of a RAG pipeline; staying current is the second. LlamaIndex's insert, update_ref_doc, delete_ref_doc, and refresh_ref_docs methods let you add, change, and remove documents without rebuilding the whole index.
What you'll be able to do after this:
- Add new documents to an existing index without re-embedding everything from scratch
- Update or delete specific documents when content changes or goes stale
- Use refresh_ref_docs to batch-update only the documents that have actually changed, saving embedding API calls
Once your basic RAG pipeline works, you hit the real challenge: keeping it current. Source documents change, new ones arrive, old ones get revoked. Rebuilding the whole index from scratch is fine for demos, but at production scale it is slow and wastes money re-embedding unchanged content.
LlamaIndex's document management API gives you surgical control. Install and build your index with persistence so it survives between runs:
pip install llama-index llama-index-core
from llama_index.core import VectorStoreIndex, Document, StorageContext
from llama_index.core.storage.docstore import SimpleDocumentStore
# Build once with a persistent doc_id per document
docs = [
    Document(text="Q4 earnings report content...", doc_id="doc-001"),
    Document(text="Company policy document...", doc_id="doc-002"),
]
storage_ctx = StorageContext.from_defaults(docstore=SimpleDocumentStore())
index = VectorStoreIndex.from_documents(docs, storage_context=storage_ctx)
index.storage_context.persist("./storage")
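Every update and delete call keys on doc_id, so the IDs have to come out the same on every run. One way to get that, sketched here on the assumption that each document is a file under a single source directory, is to derive the ID from the file's path relative to that directory. The helper name stable_doc_id and the docs/ layout are hypothetical and not part of LlamaIndex.

```python
from pathlib import PurePosixPath


def stable_doc_id(root: str, file_path: str) -> str:
    """Derive a doc_id from a file's path relative to the source root.

    Hypothetical helper: the same file always maps to the same ID,
    no matter when or in what order the files are read.
    """
    rel = PurePosixPath(file_path).relative_to(PurePosixPath(root))
    return rel.as_posix()


# Assumed layout: all source files live under "docs/"
print(stable_doc_id("docs", "docs/finance/q4-report.md"))  # finance/q4-report.md
```

Hand-assigned IDs like doc-001 work too, as long as the mapping from source file to ID is stored somewhere and never regenerated.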
Reload later and update without rebuilding:
from llama_index.core import load_index_from_storage
storage = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage)
# Add a new document
index.insert(Document(text="New policy added today", doc_id="doc-042"))
# Update content that changed (LlamaIndex deletes + re-embeds under the hood)
index.update_ref_doc(Document(text="Updated Q4 report — revised figures", doc_id="doc-001"))
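# Delete a stale or revoked document
index.delete_ref_doc("doc-002", delete_from_docstore=True)
# Write the changes back to disk; the default stores live in memory until persisted
index.storage_context.persist("./storage")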
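To make the "deletes + re-embeds" comment concrete, here is a toy model of an index that tracks which chunks came from which source document. It is a simplified illustration, not LlamaIndex's implementation: the ToyIndex class and its sentence-splitting chunker are assumptions made for the sketch, and no embedding happens.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Simplified stand-in for an index: chunks keyed by their source doc_id."""

    chunks: dict[str, list[str]] = field(default_factory=dict)

    def insert(self, doc_id: str, text: str) -> None:
        # Split into sentence-like chunks; a real index would also embed each one.
        self.chunks[doc_id] = [s.strip() for s in text.split(".") if s.strip()]

    def delete_ref_doc(self, doc_id: str) -> None:
        # Drop every chunk derived from this source document.
        self.chunks.pop(doc_id, None)

    def update_ref_doc(self, doc_id: str, text: str) -> None:
        # Update is modeled as a delete followed by a fresh insert.
        self.delete_ref_doc(doc_id)
        self.insert(doc_id, text)


toy = ToyIndex()
toy.insert("doc-001", "Revenue rose. Costs fell.")
toy.update_ref_doc("doc-001", "Revenue was flat.")
print(toy.chunks["doc-001"])  # ['Revenue was flat']
```

The point of the model is that no chunk from the old version survives an update, which is why the doc_id has to identify the same source document every time.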
Batch refresh: only re-embed what changed
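refresh_ref_docs is the workhorse for scheduled syncs. It compares a hash of each document against the one stored under the same doc_id. Unchanged documents are skipped with no embedding call, changed ones are updated, and IDs it has not seen before are inserted: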
from pathlib import Path
# Pass the current version of every document in your source
current_docs = [
    Document(text=Path("q4-report.md").read_text(encoding="utf-8"), doc_id="doc-001"),
    Document(text=Path("policy.md").read_text(encoding="utf-8"), doc_id="doc-002"),
]
refreshed = index.refresh_ref_docs(current_docs)
# refreshed[i] = True if doc i was re-indexed, False if unchanged
print(f"Re-indexed {sum(refreshed)}/{len(refreshed)} docs")
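The skip-if-unchanged behavior can be modeled in a few lines of plain Python. This is a sketch of the idea only: the refresh and content_hash functions are hypothetical, and the library's own hash may cover more than the raw text (metadata, for example).

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash the document text so versions can be compared cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def refresh(stored_hashes: dict[str, str], docs: dict[str, str]) -> list[bool]:
    """Return one flag per doc: True if new or changed, False if skipped.

    stored_hashes maps doc_id -> hash from the previous run and is
    updated in place.
    """
    results = []
    for doc_id, text in docs.items():
        new_hash = content_hash(text)
        changed = stored_hashes.get(doc_id) != new_hash
        if changed:
            # A real pipeline would re-chunk and re-embed here.
            stored_hashes[doc_id] = new_hash
        results.append(changed)
    return results


stored: dict[str, str] = {}
first = refresh(stored, {"doc-001": "Q4 report", "doc-002": "Policy"})
second = refresh(stored, {"doc-001": "Q4 report (revised)", "doc-002": "Policy"})
print(first, second)  # [True, True] [True, False]
```

On the second run only doc-001 is flagged, so only that document would cost an embedding call.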
The production maintenance loop:
- Build the index once with persist_dir set
- Nightly job: call refresh_ref_docs with the latest version of every source document
- Hard deletes (revoked docs): call delete_ref_doc immediately, don't wait for the nightly run
- New documents: index.insert() any time, no rebuild required
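One gap worth planning for: refresh_ref_docs only looks at the documents you pass it, so a document that has vanished from the source stays in the index until something deletes it. A minimal sketch of how a nightly job might split the work follows. The plan_sync helper is hypothetical, and where the set of indexed IDs comes from is left to you (the index's ref_doc_info property is one candidate, assuming your storage backend supports it).

```python
def plan_sync(indexed_ids: set[str], source_ids: set[str]) -> dict[str, set[str]]:
    """Compare the IDs in the index with the IDs in the source.

    Hypothetical helper: returns which doc_ids are new, which are gone
    from the source, and which exist on both sides and need a hash check.
    """
    return {
        "insert": source_ids - indexed_ids,   # in source, not yet indexed
        "delete": indexed_ids - source_ids,   # indexed, gone from source
        "refresh": indexed_ids & source_ids,  # on both sides, may have changed
    }


plan = plan_sync({"doc-001", "doc-002"}, {"doc-001", "doc-042"})
print(sorted(plan["delete"]))  # ['doc-002']
```

Each ID in the delete bucket would go to delete_ref_doc, while the other two buckets can be handed to refresh_ref_docs together.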
This turns your knowledge base from a one-time build artifact into a long-running service that stays synchronized with your source content.
Sources: LlamaIndex Document Management docs | LlamaIndex in Python — Real Python