
Add Semantic Search to Your Existing Postgres: pgvector + Supabase with HNSW Indexes
Chris Harper
Jul 3, 2026 · 04:02 UTC
TL;DR: pgvector turns any Postgres table into a vector store; HNSW indexes keep similarity queries under 10ms at scale — no separate vector database required.
What you'll be able to do after this:
- Add a vector column to an existing Postgres table and store embeddings alongside your application data
- Build an HNSW index that stays fast as your collection grows to millions of rows
- Write a reusable match function that returns nearest-neighbor results with cosine similarity scores, callable from any Supabase client
Most tutorials reach for a standalone vector database (Chroma, Pinecone, Qdrant) for semantic search. If your app already runs on Postgres or Supabase, pgvector gives you the same capability inside the database you already have — one connection pool, one backup, one set of permissions, zero extra infrastructure.
Step 1 — Enable pgvector
create extension if not exists vector with schema extensions;
In Supabase: Dashboard → Database → Extensions → search "vector" → enable.
Step 2 — Add an embedding column
Set the dimension to match your model (384 for Supabase/gte-small, 1536 for OpenAI text-embedding-3-small):
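create table documents (
  id bigserial primary key, -- bigint, matching the id type match_documents returns in Step 5
  title text not null,
  body text not null,
  embedding extensions.vector(384) -- dimension must match the embedding model
);
-- For a table that already exists, add only the column instead:
-- alter table documents add column embedding extensions.vector(384);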
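pgvector rejects a vector whose length differs from the column's declared dimension, so swapping embedding models without changing the column shows up as failed inserts. A minimal sketch of a client-side guard that catches the mismatch earlier is below. The names `assertEmbeddingShape` and `EXPECTED_DIMENSIONS` are hypothetical, not part of pgvector or the Supabase client.

```javascript
// Assumption: 384 matches the vector(384) column declared above.
const EXPECTED_DIMENSIONS = 384

// Hypothetical helper: validate an embedding before sending it to Postgres.
function assertEmbeddingShape(embedding, expected = EXPECTED_DIMENSIONS) {
  if (!Array.isArray(embedding)) {
    throw new TypeError('embedding must be a plain array of numbers')
  }
  if (embedding.length !== expected) {
    throw new RangeError(`expected ${expected} dimensions, got ${embedding.length}`)
  }
  if (!embedding.every(Number.isFinite)) {
    throw new RangeError('embedding contains NaN or non-finite values')
  }
  return embedding
}
```

Calling it on the array you are about to insert turns a database error into an immediate, readable exception in application code.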
Step 3 — Insert embeddings
Using Transformers.js and the Supabase JS client:
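import { pipeline } from '@huggingface/transformers' // '@xenova/transformers' before v3
import { createClient } from '@supabase/supabase-js'
// SUPABASE_URL and SUPABASE_KEY are placeholder names for your project's credentials
const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_KEY)
const pipe = await pipeline('feature-extraction', 'Supabase/gte-small')
const output = await pipe(body, { pooling: 'mean', normalize: true })
const { error } = await supabase.from('documents').insert({
  title, body, embedding: Array.from(output.data)
})
if (error) throw error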
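The `pooling: 'mean'` and `normalize: true` options do two things: average the per-token vectors into one vector, then scale that vector to unit length. Here is a rough sketch of both steps in plain JavaScript. `meanPool` and `l2Normalize` are illustrative names, and this simplified version ignores padding and attention masks, which a real pipeline accounts for.

```javascript
// Average per-token vectors into a single vector (mean pooling).
// Simplification: every token counts equally; padding is not masked out.
function meanPool(tokenVectors) {
  const dims = tokenVectors[0].length
  const pooled = new Array(dims).fill(0)
  for (const vec of tokenVectors) {
    for (let i = 0; i < dims; i++) {
      pooled[i] += vec[i] / tokenVectors.length
    }
  }
  return pooled
}

// Scale a vector to unit length (L2 normalization).
function l2Normalize(vec) {
  const norm = Math.hypot(...vec)
  if (norm === 0) return vec.slice() // a zero vector cannot be normalized
  return vec.map((x) => x / norm)
}
```

Unit-length vectors are what make cosine distance a sensible default in the next step, since every stored embedding then has the same magnitude.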
Step 4 — Create an HNSW index
HNSW is the recommended index type in 2026. Unlike IVFFlat, it needs no training pass over existing rows, so it can be created on an empty table and is updated automatically as rows are inserted. Create one index, using the operator class that matches the distance metric you query with:
-- cosine distance (most common; pairs with the <=> operator used in Step 5)
create index on documents using hnsw (embedding vector_cosine_ops);
-- Euclidean / L2 distance (pairs with the <-> operator; for unnormalized embeddings)
create index on documents using hnsw (embedding vector_l2_ops);
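HNSW scales to roughly 5 to 10 million vectors on a standard Supabase Pro instance. pgvector can index vector columns of up to 2,000 dimensions; for models that exceed that, switch to the halfvec type (pgvector 0.7.0+), which stores half-precision values and can be indexed up to 4,000 dimensions.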
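The dimension limits above can be turned into a small planning helper. The sketch below picks a column type from the model's dimension count and estimates raw vector storage. It assumes the per-vector sizes given in the pgvector README (4 bytes per dimension plus 8 for vector, 2 bytes per dimension plus 8 for halfvec) and leaves out index and row overhead. `pickColumnType` and `estimateRawVectorBytes` are hypothetical names.

```javascript
// Indexable dimension limits as described above.
const MAX_INDEXED_DIMS = { vector: 2000, halfvec: 4000 }
// Assumption: per-element sizes taken from the pgvector README.
const BYTES_PER_DIMENSION = { vector: 4, halfvec: 2 }

// Hypothetical helper: choose the column type an HNSW index can cover.
function pickColumnType(dimensions) {
  if (!Number.isInteger(dimensions) || dimensions < 1) {
    throw new RangeError('dimensions must be a positive integer')
  }
  if (dimensions <= MAX_INDEXED_DIMS.vector) return `vector(${dimensions})`
  if (dimensions <= MAX_INDEXED_DIMS.halfvec) return `halfvec(${dimensions})`
  throw new RangeError(`${dimensions} dimensions exceeds the indexable limit`)
}

// Rough raw storage for the vector data alone (no index, no row overhead).
function estimateRawVectorBytes(dimensions, rowCount, type = 'vector') {
  return (BYTES_PER_DIMENSION[type] * dimensions + 8) * rowCount
}
```

By this estimate, one million 384-dimension vectors come to about 1.5 GB of raw vector data, which is a useful starting point when sizing an instance.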
Step 5 — Query by semantic similarity
The <=> operator returns cosine distance, which ranges from 0 to 2; 1 - distance gives cosine similarity, which ranges from -1 to 1 (higher means more similar):
create or replace function match_documents(
  query_embedding extensions.vector(384),
  match_threshold float,
  match_count int
) returns table(id bigint, title text, body text, similarity float)
language sql stable as $$
  -- columns are table-qualified so they cannot be confused with the output column names
  select documents.id, documents.title, documents.body,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where 1 - (documents.embedding <=> query_embedding) > match_threshold
  order by documents.embedding <=> query_embedding
  limit match_count;
$$;
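To see what the function's similarity column means, it can help to compute the same quantity by hand. The following is a plain JavaScript sketch of cosine distance and the derived similarity. `cosineDistance` and `cosineSimilarity` are illustrative names, not Supabase or pgvector APIs.

```javascript
// Cosine distance: 1 minus the cosine of the angle between two vectors.
// Note: a zero-length vector makes the result NaN in this sketch.
function cosineDistance(a, b) {
  if (a.length !== b.length) {
    throw new RangeError('vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return 1 - dot / Math.sqrt(normA * normB)
}

// The similarity score returned by match_documents is 1 minus the distance.
const cosineSimilarity = (a, b) => 1 - cosineDistance(a, b)
```

Identical directions give a distance of 0 (similarity 1), perpendicular vectors give 1 (similarity 0), and opposite directions give 2 (similarity -1), which is why a threshold of 0.75 keeps only fairly close matches.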
Call it from your app:
const { data } = await supabase.rpc('match_documents', {
  query_embedding: embedding, // float[] from your embedding model
  match_threshold: 0.75, // tune per use case; 0.75-0.85 is a good start
  match_count: 10
})
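In an application you will usually want one function that embeds the query text, calls the RPC, and surfaces errors. One possible shape is sketched below. `searchDocuments` is a hypothetical wrapper, and `embed` stands for whatever async function turns text into a 384-element array with your model. The Supabase client is passed in rather than created here, so the sketch makes no assumptions about your credentials.

```javascript
// Hypothetical wrapper around the match_documents RPC.
// supabase: a Supabase client (anything exposing rpc(name, args))
// embed:    async (text) => number[] matching the column dimension
async function searchDocuments(supabase, embed, query, options = {}) {
  const { threshold = 0.75, count = 10 } = options
  const embedding = await embed(query)
  const { data, error } = await supabase.rpc('match_documents', {
    query_embedding: embedding,
    match_threshold: threshold,
    match_count: count,
  })
  // supabase-js reports failures in the error field instead of throwing
  if (error) throw new Error(`match_documents failed: ${error.message}`)
  return data ?? []
}
```

Because both dependencies are injected, the same function can be exercised in tests with a stub client and a stub embedder before it touches a real database.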
Sources: Vector columns — Supabase Docs · HNSW indexes — Supabase Docs · AI & Vectors overview — Supabase Docs