Keep Your RAG Knowledge Base Fresh: LlamaIndex Document Management for Real Apps

Chris Harper

3 min read

Jul 1, 2026 · 20:03 UTC

AI
Tutorial
RAG
Embeddings

TL;DR: Your RAG pipeline's first job is ingestion; its second is staying current. Use LlamaIndex's insert, update_ref_doc, delete_ref_doc, and refresh_ref_docs methods to add, change, and remove docs without rebuilding your whole index.

What you'll be able to do after this:

  • Add new documents to an existing index without re-embedding everything from scratch
  • Update or delete specific documents when content changes or goes stale
  • Use refresh_ref_docs to batch-update only the documents that have actually changed, which saves embedding API calls

Once your basic RAG pipeline works, you hit the real challenge: keeping it current. Source documents change, new ones arrive, old ones get revoked. Rebuilding the whole index from scratch is fine for demos, but at production scale it is slow and you pay to re-embed content that hasn't changed.

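LlamaIndex's document management API lets you change individual documents in place. Install the library and build your index with persistence so it survives between runs (the default embedding model calls OpenAI, so OPENAI_API_KEY must be set):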

pip install llama-index llama-index-core

from llama_index.core import VectorStoreIndex, Document, StorageContext
from llama_index.core.storage.docstore import SimpleDocumentStore

# Build once with a persistent doc_id per document
docs = [
    Document(text="Q4 earnings report content...", doc_id="doc-001"),
    Document(text="Company policy document...",    doc_id="doc-002"),
]
storage_ctx = StorageContext.from_defaults(docstore=SimpleDocumentStore())
index = VectorStoreIndex.from_documents(docs, storage_context=storage_ctx)
index.storage_context.persist("./storage")
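
Every later operation looks documents up by doc_id, so the ids have to come out the same on every run. Hand-written ids like "doc-001" work for a demo. For file-based sources, one option is to derive the id from the file's path relative to the source root. The sketch below assumes that layout; the helper name stable_doc_id and the /data/kb directory are hypothetical and not part of LlamaIndex:

```python
from pathlib import Path


def stable_doc_id(path: Path, root: Path) -> str:
    """Derive a doc_id that is identical on every run and every machine.

    Hypothetical helper: uses the file's path relative to the source root,
    written with forward slashes so the OS path separator doesn't leak in.
    """
    return path.relative_to(root).as_posix()


root = Path("/data/kb")  # hypothetical source directory
print(stable_doc_id(root / "policies" / "leave.md", root))  # policies/leave.md
```

The trade-off is that renaming or moving a file changes its id, so the index would see the old path as a stale document and the new path as a new one.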

Reload later and update without rebuilding:

from llama_index.core import load_index_from_storage

storage = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage)

# Add a new document
index.insert(Document(text="New policy added today", doc_id="doc-042"))

# Update content that changed (LlamaIndex deletes + re-embeds under the hood)
index.update_ref_doc(Document(text="Updated Q4 report — revised figures", doc_id="doc-001"))

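# Delete a stale or revoked document
index.delete_ref_doc("doc-002", delete_from_docstore=True)

# Write the changes back to disk; the default in-memory stores only save when asked
index.storage_context.persist("./storage")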

Batch refresh: only re-embed what changed

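refresh_ref_docs is the workhorse for scheduled syncs. It compares each document's content hash with the hash stored under the same doc_id. Unchanged documents are skipped with no embed call, changed ones are updated, and ids the index hasn't seen before are inserted: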

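from pathlib import Path

# Pass the current version of every document in your source
current_docs = [
    Document(text=Path("q4-report.md").read_text(encoding="utf-8"), doc_id="doc-001"),
    Document(text=Path("policy.md").read_text(encoding="utf-8"),    doc_id="doc-002"),
]
refreshed = index.refresh_ref_docs(current_docs)
# refreshed[i] is True if doc i was inserted or re-indexed, False if unchanged
print(f"Re-indexed {sum(refreshed)}/{len(refreshed)} docs")

# Save the refreshed index so the next run starts from the new hashes
index.storage_context.persist("./storage")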
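
To make the skip logic concrete, here is a minimal stdlib-only sketch of hash-based change detection. It is an illustration of the idea, not LlamaIndex's implementation: the function names, the SHA-256 choice, and the plain dict standing in for the docstore are all assumptions made for the example.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash the document text (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def classify_refresh(stored_hashes: dict[str, str], current: dict[str, str]) -> dict[str, str]:
    """Label each current doc as inserted, updated, or unchanged.

    stored_hashes maps doc_id -> hash from the previous run and is updated
    in place; current maps doc_id -> the latest text from the source.
    """
    outcome = {}
    for doc_id, text in current.items():
        new_hash = content_hash(text)
        old_hash = stored_hashes.get(doc_id)
        if old_hash is None:
            outcome[doc_id] = "inserted"   # never seen: needs embedding
        elif old_hash != new_hash:
            outcome[doc_id] = "updated"    # content changed: re-embed
        else:
            outcome[doc_id] = "unchanged"  # same hash: no embed call needed
        stored_hashes[doc_id] = new_hash
    return outcome


stored = {}
print(classify_refresh(stored, {"doc-001": "Q4 report", "doc-002": "Policy"}))
print(classify_refresh(stored, {"doc-001": "Q4 report", "doc-002": "Policy v2"}))
```

On the second call only doc-002 would need an embed call, which is where the savings on a nightly sync come from.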

The production maintenance loop:

  1. Build the index once and persist it to disk
  2. Nightly job: call refresh_ref_docs with the latest version of every source document, then persist again
  3. Hard deletes (revoked docs): call delete_ref_doc immediately rather than waiting for the nightly run; refresh_ref_docs does not remove documents that are missing from the list you pass
  4. New documents: call index.insert() at any time; no rebuild is required
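
One gap worth planning for in the loop above: a document that quietly disappears from the source is not part of the list handed to a refresh, so it can linger in the index. A nightly job can close that gap by comparing the ids already in the index with the ids currently in the source. The sketch below is plain Python with a hypothetical plan_sync helper; it only decides what to do and leaves the actual index calls to you.

```python
def plan_sync(indexed_ids, source_ids):
    """Decide which doc_ids to refresh and which to delete.

    Hypothetical helper: indexed_ids are the ids currently in the index,
    source_ids are the ids present in the source right now.
    """
    indexed, source = set(indexed_ids), set(source_ids)
    return {
        # Pass every current doc to the refresh; unchanged ones are skipped by hash
        "refresh": sorted(source),
        # In the index but gone from the source: remove explicitly
        "delete": sorted(indexed - source),
    }


plan = plan_sync(
    indexed_ids={"doc-001", "doc-002", "doc-042"},
    source_ids={"doc-001", "doc-042", "doc-100"},
)
print(plan["delete"])  # ['doc-002']
```

In a LlamaIndex job, the indexed ids would likely come from index.ref_doc_info (worth verifying for your vector store, since not every backend exposes it), and each id under "delete" would go to delete_ref_doc.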

This turns your knowledge base from a one-time build artifact into a long-running service that stays synchronized with your source content.

Sources: LlamaIndex Document Management docs | LlamaIndex in Python — Real Python