Add Semantic Search to Your Existing Postgres: pgvector + Supabase with HNSW Indexes

Chris Harper

Jul 3, 2026 · 04:02 UTC

Tutorial

Vectors

Embeddings

TL;DR: pgvector lets any Postgres table store and search embeddings, and an HNSW index can keep similarity queries fast (often under 10 ms when the index fits in memory) as the table grows, so no separate vector database is required.

What you'll be able to do after this:

Add a vector column to an existing Postgres table and store embeddings alongside your application data
Build an HNSW index that stays fast as your collection grows to millions of rows
Write a reusable match function that returns nearest-neighbor results with cosine similarity scores, callable from any Supabase client

Most tutorials reach for a standalone vector database (Chroma, Pinecone, Qdrant) for semantic search. If your app already runs on Postgres or Supabase, pgvector gives you the same capability inside the database you already have — one connection pool, one backup, one set of permissions, zero extra infrastructure.

Step 1 — Enable pgvector

create extension if not exists vector with schema extensions;

In Supabase: Dashboard → Database → Extensions → search "vector" → enable.

Step 2 — Add an embedding column

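Set the dimension to match your embedding model's output (384 for Supabase/gte-small, 1536 for OpenAI text-embedding-3-small). For a table that already exists, run alter table documents add column embedding extensions.vector(384); and skip the create statement. To start from a new table: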

create table documents (
  id        bigserial primary key,  -- bigint, matching the id type returned by match_documents in Step 5
  title     text not null,
  body      text not null,
  embedding extensions.vector(384)
);

Step 3 — Insert embeddings

Using Transformers.js and the Supabase JS client:

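import { pipeline } from '@huggingface/transformers'

// Embed the body text: mean pooling plus normalization yields a unit-length 384-dim vector
const pipe = await pipeline('feature-extraction', 'Supabase/gte-small')
const output = await pipe(body, { pooling: 'mean', normalize: true })
// `supabase` is an initialized client from createClient() in @supabase/supabase-js
const { error } = await supabase.from('documents').insert({
  title, body, embedding: Array.from(output.data)
})
if (error) throw error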
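Postgres rejects an insert whose array length differs from the column's declared dimension (384 here), so it can be useful to catch a mismatched model or a bad value in the client first. The following is a minimal sketch; `assertEmbedding` and `EXPECTED_DIMS` are hypothetical names for illustration, not part of Transformers.js or the Supabase client.

```javascript
// Hypothetical helper: checks an embedding before it is sent to Postgres.
const EXPECTED_DIMS = 384 // assumption: matches the vector(384) column from Step 2

function assertEmbedding(values, dims = EXPECTED_DIMS) {
  // Accept typed arrays (e.g. the Float32Array in output.data) as well as plain arrays
  const arr = Array.from(values)
  if (arr.length !== dims) {
    throw new RangeError(`expected ${dims} dimensions, got ${arr.length}`)
  }
  if (!arr.every(Number.isFinite)) {
    throw new TypeError('embedding contains NaN or non-finite values')
  }
  return arr // plain number[] ready for the insert payload
}
```

In the insert above, this would replace `Array.from(output.data)` with `assertEmbedding(output.data)`.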

Step 4 — Create an HNSW index

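HNSW is the recommended index type for most workloads. It has no training step, so unlike IVFFlat, whose clusters are computed from the rows present at build time, it can be created on an empty table and stays accurate as rows are inserted. Create one index, using the operator class that matches the distance operator in your queries (<=> for cosine, <-> for Euclidean):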

-- cosine distance (use with normalized embeddings — most common)
create index on documents using hnsw (embedding vector_cosine_ops);

-- Euclidean / L2 (for unnormalized embeddings)
create index on documents using hnsw (embedding vector_l2_ops);

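As a rough guide, HNSW handles around 5–10 million vectors on a standard Supabase Pro instance; the practical ceiling depends on the embedding dimension and on whether the index fits in memory. pgvector can index vector columns of up to 2,000 dimensions. For larger models, store embeddings as halfvec (half precision), which can be indexed up to 4,000 dimensions (pgvector 0.7.0+).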
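To get a feel for these limits, the sketch below estimates raw embedding storage and picks a column type by dimension. It assumes pgvector's documented per-value sizes (4 bytes per dimension plus 8 for vector, 2 bytes per dimension plus 8 for halfvec) and ignores row, page, and HNSW graph overhead, so treat the result as a lower bound. The function names are hypothetical.

```javascript
// Rough sizing sketch; numbers cover the embedding values only.
const TYPES = {
  vector:  { bytesPerDim: 4, maxIndexedDims: 2000 },
  halfvec: { bytesPerDim: 2, maxIndexedDims: 4000 },
}

// Return the first type whose index supports `dims`, or null if neither does.
function indexableType(dims) {
  if (dims <= TYPES.vector.maxIndexedDims) return 'vector'
  if (dims <= TYPES.halfvec.maxIndexedDims) return 'halfvec'
  return null
}

// Raw bytes for `rows` embeddings of `dims` dimensions (no row or index overhead).
function rawVectorBytes(rows, dims, type = 'vector') {
  return rows * (TYPES[type].bytesPerDim * dims + 8)
}

const gib = (bytes) => (bytes / 2 ** 30).toFixed(2) + ' GiB'
console.log(indexableType(384), gib(rawVectorBytes(5_000_000, 384)))
console.log(indexableType(3072), gib(rawVectorBytes(5_000_000, 3072, 'halfvec')))
```

Comparing the estimate against the instance's available memory gives a first indication of whether an index over the column is likely to stay in RAM.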

Step 5 — Query by semantic similarity

The <=> operator returns cosine distance (0 to 2), so 1 - distance is the cosine similarity (-1 to 1; for typical text embeddings, scores mostly fall between 0 and 1):

create or replace function match_documents(
  query_embedding extensions.vector(384),
  match_threshold float,
  match_count     int
) returns table(id bigint, title text, body text, similarity float)
language sql stable as $$
  select id, title, body,
         1 - (embedding <=> query_embedding) as similarity
  from documents
  where 1 - (embedding <=> query_embedding) > match_threshold
  order by embedding <=> query_embedding
  limit match_count;
$$;
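For intuition, here is what match_documents computes, written as an exact brute-force search over an in-memory array. This is only an illustrative sketch: the database version uses the HNSW index and returns approximate nearest neighbors, and the sample rows are made up.

```javascript
// Cosine similarity of two equal-length numeric arrays.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Exact counterpart of the SQL function: score, filter, sort, limit.
function matchDocuments(rows, queryEmbedding, matchThreshold, matchCount) {
  return rows
    .map(({ embedding, ...rest }) => ({
      ...rest,
      similarity: cosineSimilarity(embedding, queryEmbedding), // 1 - (embedding <=> query)
    }))
    .filter((row) => row.similarity > matchThreshold) // where similarity > match_threshold
    .sort((x, y) => y.similarity - x.similarity)      // order by distance ascending
    .slice(0, matchCount)                              // limit match_count
}

// Made-up 2-dimensional rows to show the full similarity range.
const sampleRows = [
  { id: 1, title: 'same direction', embedding: [1, 0] },
  { id: 2, title: 'orthogonal',     embedding: [0, 1] },
  { id: 3, title: 'opposite',       embedding: [-1, 0] },
]
// Similarities for ids 1, 2, 3 are 1, 0, and -1 respectively.
console.log(matchDocuments(sampleRows, [1, 0], -2, 10))
```

The opposite-direction row shows why the score is not strictly bounded at 0, and why a positive threshold such as 0.75 filters out unrelated rows.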

Call it from your app:

const { data } = await supabase.rpc('match_documents', {
  query_embedding: embedding,   // float[] from your embedding model
  match_threshold: 0.75,        // tune per use case; 0.75-0.85 is a good start
  match_count: 10
})
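In practice the query text has to be embedded with the same model used in Step 3 before the RPC is called, and the Supabase client reports failures through an `error` field rather than by throwing. One way to package both concerns is sketched below; `searchDocuments` and the injected `embed` function are hypothetical names, and the defaults simply mirror the call above.

```javascript
// Hypothetical wrapper: embeds a query string, then calls the match_documents RPC.
// `supabase` is a Supabase JS client; `embed` is any async function returning number[].
async function searchDocuments(supabase, embed, query, { threshold = 0.75, count = 10 } = {}) {
  const queryEmbedding = await embed(query)
  const { data, error } = await supabase.rpc('match_documents', {
    query_embedding: queryEmbedding,
    match_threshold: threshold,
    match_count: count,
  })
  // Surface RPC failures as exceptions so callers cannot silently ignore them
  if (error) throw new Error(`match_documents failed: ${error.message}`)
  return data ?? [] // rows of { id, title, body, similarity }
}
```

With the Step 3 pipeline, `embed` could be `async (text) => Array.from((await pipe(text, { pooling: 'mean', normalize: true })).data)`. Passing the client and the embedder as arguments keeps the function easy to test with stubs.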

Sources: Vector columns — Supabase Docs · HNSW indexes — Supabase Docs · AI & Vectors overview — Supabase Docs
