Give Your Agent a Long-Term Memory: Semantic Recall with LangGraph and LangMem

Chris Harper

3 min read

Jul 2, 2026 · 04:03 UTC

Tutorial

Agents

LLM

TL;DR: Semantic memory lets an agent recall what it learned about a user in past sessions — retrieved by relevance, not session ID. This free 1-hour course by the LangChain CEO builds it step-by-step with runnable notebooks.

What you'll be able to do after this:

Store agent observations as embeddings in a LangGraph InMemoryStore — not tied to a thread ID
Retrieve past facts by semantic similarity so an agent can recall "this user prefers bullet-point answers" in a brand-new session
Layer episodic memory (past examples) and procedural memory (evolved system prompts) to build agents that improve with use

Most agent memory is ephemeral. The conversation ends, the context clears, and the agent starts fresh. The checkpointer pattern (covered previously) gives you session continuity via thread IDs, but to retrieve a memory you need the thread ID from that exact session. Semantic memory removes that constraint: facts are stored as vector embeddings and retrieved by relevance to the current query, not by exact-ID lookup. An agent with semantic memory can recall "this user dislikes long explanations" in a session it has never seen before.

This free DeepLearning.AI course by Harrison Chase (co-founder and CEO of LangChain) builds a real email-routing agent and progressively layers on three memory types:

Lesson	What gets added
3 (baseline)	Stateless agent — establishes the starting point
4	Semantic memory — store user facts as embeddings; retrieve by similarity
5	+ Episodic memory — past interactions as few-shot examples, retrieved semantically
6	+ Procedural memory — system prompt updates based on learned preferences

Each lesson is under 16 minutes and runs in a hosted notebook — no local setup.
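The layering in the table above can be sketched in plain Python. This is only an illustration of how the three memory types might end up in one prompt, not the course's code: AgentMemory, build_prompt, and the "respond" label are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Toy container for the three memory types (hypothetical, for illustration)."""
    # Procedural memory: the system prompt itself, which can be rewritten over time.
    procedural: str = "You are an email triage assistant."
    # Semantic memory: facts about the user.
    semantic: list[str] = field(default_factory=list)
    # Episodic memory: past (email, label) pairs reused as few-shot examples.
    episodic: list[tuple[str, str]] = field(default_factory=list)


def build_prompt(memory: AgentMemory, email: str) -> str:
    """Assemble one prompt, adding a section only for memory types that hold content."""
    parts = [memory.procedural]
    if memory.semantic:
        parts.append("Known facts:\n" + "\n".join(f"- {fact}" for fact in memory.semantic))
    if memory.episodic:
        shots = "\n".join(f"Email: {e}\nLabel: {label}" for e, label in memory.episodic)
        parts.append("Past examples:\n" + shots)
    parts.append(f"Email: {email}\nLabel:")
    return "\n\n".join(parts)


# Lesson 3 style baseline: nothing remembered, so only the system prompt and the email appear.
baseline = build_prompt(AgentMemory(), "Lunch on Friday?")

# Later lessons: the same call, now with semantic facts and episodic examples layered in.
layered = build_prompt(
    AgentMemory(
        semantic=["Alice prefers bullet-point summaries"],
        episodic=[("Quarterly invoice attached", "respond")],
    ),
    "Lunch on Friday?",
)
```

Keeping each memory type in its own section means a lesson can add one layer without touching the others, which mirrors how the course builds up from the stateless baseline.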

The core pattern (Lesson 4)

LangGraph's InMemoryStore (or any compatible backend store) supports vector indexing:

from langgraph.store.memory import InMemoryStore
from langchain_openai import OpenAIEmbeddings

# Create a vector-backed store
store = InMemoryStore(
    index={"embed": OpenAIEmbeddings(), "dims": 1536}
)

# Store a memory during an agent interaction
store.put(
    namespace=("user_facts", "alice"),  # scoped by user
    key="format_pref",
    value={"content": "Prefers bullet-point summaries over paragraphs"}
)

# Later — a new session, no shared thread ID
# search() takes the namespace prefix as a positional-only argument
results = store.search(
    ("user_facts", "alice"),
    query="How should I format my response?",
    limit=3
)
for r in results:
    print(r.value["content"])
# → "Prefers bullet-point summaries over paragraphs"

The namespace scopes memories by user or context. search() embeds the query and ranks the stored items under that namespace by cosine similarity to it; no thread ID or exact-key match is required.
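To make the ranking step concrete, here is a minimal stand-in that needs no API key. It is a sketch under loud assumptions: ToyStore, embed, and cosine are hypothetical names, the "embedding" is just word counts (so it only matches shared words, unlike a real embedding model), and namespaces are matched exactly. It is not how InMemoryStore is implemented.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real store would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyStore:
    """Hypothetical in-memory store: namespaced items ranked by cosine similarity."""

    def __init__(self):
        self._items = {}  # (namespace, key) -> (value, vector)

    def put(self, namespace, key, value):
        # Re-using a key overwrites the old item, so a fact can be updated in place.
        self._items[(namespace, key)] = (value, embed(value["content"]))

    def search(self, namespace, query, limit=3):
        q = embed(query)
        scored = [
            (cosine(q, vec), value)
            for (ns, _key), (value, vec) in self._items.items()
            if ns == namespace  # assumption: exact namespace match only
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [value for _score, value in scored[:limit]]


store = ToyStore()
alice = ("user_facts", "alice")
store.put(alice, "format_pref", {"content": "prefers bullet point format for every response"})
store.put(alice, "timezone", {"content": "works in the Berlin timezone"})
store.put(("user_facts", "bob"), "format_pref", {"content": "wants long format response paragraphs"})

# Bob's memory is never considered; Alice's two items are ranked against the query.
top = store.search(alice, "which format should my response use", limit=1)
```

The namespace filter runs before any scoring, which is why one user's facts cannot leak into another user's results no matter how similar the text is.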

The agent wires this into its context at the start of each turn:

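from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI

# Assumption: any chat model works here; swap in the one your agent uses
model = ChatOpenAI(model="gpt-4o-mini")

def call_model(state, config, *, store):
    # user_id comes from the run config, so it is the same across threads
    user_id = config["configurable"]["user_id"]

    # Pull relevant memories from the store (namespace prefix is positional-only)
    memories = store.search(
        ("user_facts", user_id),
        query=state["messages"][-1].content,
        limit=3
    )
    mem_context = "\n".join(m.value["content"] for m in memories)

    system = f"User preferences:\n{mem_context}\n\nAssistant instructions..."
    return {"messages": [model.invoke([SystemMessage(system)] + state["messages"])]}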
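One detail worth handling in the snippet above: when the search returns nothing (a brand-new user, say), the prompt still carries an empty "User preferences:" section. A small helper can leave the section out in that case. This is a sketch, not part of the course code: format_system_prompt is a hypothetical name, and the SimpleNamespace objects are stand-ins for search results that expose only a .value attribute.

```python
from types import SimpleNamespace


def format_system_prompt(memories, instructions: str) -> str:
    """Prepend a preferences section only when the search returned something."""
    facts = [m.value["content"] for m in memories if m.value.get("content")]
    if not facts:
        return instructions
    bullet_list = "\n".join(f"- {fact}" for fact in facts)
    return f"User preferences:\n{bullet_list}\n\n{instructions}"


# Stand-ins for search results: only the .value attribute is used here.
hits = [SimpleNamespace(value={"content": "Prefers bullet-point summaries over paragraphs"})]

with_memory = format_system_prompt(hits, "Assistant instructions...")
without_memory = format_system_prompt([], "Assistant instructions...")
```

Inside call_model, the f-string that builds system could be replaced by a call to a helper like this, keeping the prompt clean for users with no stored facts.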

What's different from the LangGraph checkpointer post

The earlier post on LangGraph memory covered session-scoped memory: checkpointing conversation state tied to a thread_id. That's short-term, thread-scoped memory: "what happened in this session." Semantic memory is long-term: "what do I know about this user across all sessions." The two are complementary; Lesson 5 uses them together.
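The scope difference can be shown with two plain dictionaries. This is a toy sketch that skips similarity ranking entirely: checkpoints, user_facts, handle_turn, and the "Remember:" message convention are all hypothetical, chosen only to show which data survives a change of thread.

```python
# Thread-scoped state: one message history per thread_id (checkpointer-style).
checkpoints: dict[str, list[str]] = {}
# User-scoped facts: keyed by user, shared by every thread that user opens.
user_facts: dict[str, list[str]] = {}


def handle_turn(thread_id: str, user_id: str, message: str) -> dict:
    """Record the message in its thread and return what the agent can see this turn."""
    history = checkpoints.setdefault(thread_id, [])
    history.append(message)
    # Hypothetical convention: messages starting with "Remember:" become long-term facts.
    if message.lower().startswith("remember:"):
        user_facts.setdefault(user_id, []).append(message.split(":", 1)[1].strip())
    return {"history": list(history), "facts": list(user_facts.get(user_id, []))}


first = handle_turn("thread-1", "alice", "Remember: keep answers short")
second = handle_turn("thread-2", "alice", "Summarize this email")
```

In the second call the history holds only the new message, because thread-2 has no access to thread-1's checkpoint, while the fact saved for alice is still visible.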

Sources: Long-Term Agentic Memory With LangGraph — DeepLearning.AI | LangMem quickstart | LangGraph memory concepts
