What Are Embeddings? From Zero to Semantic Search in 10 Lines of Python

Chris Harper

2 min read

Jun 23, 2026 · 18:15 UTC

AI
Tutorial
Embeddings
RAG

TL;DR: Embeddings convert text into vectors in which distance tracks meaning. A few lines of Python with sentence-transformers produce vectors you can use for search, RAG, and AI memory.

What you'll be able to do after this:

  • Generate semantic embeddings from any text with three lines of Python using a pretrained model
  • Find the most similar document to a query using cosine similarity, the core retrieval step in a typical RAG pipeline
  • Understand the vector space model that underpins RAG, semantic search, and AI memory

Before you can build RAG, before a vector database makes sense, before you can evaluate retrieval quality — you need to understand what an embedding is and how to generate one.

What an embedding is. An embedding model maps a piece of text to a fixed-length vector of numbers. With all-MiniLM-L6-v2, the model used in the walkthrough below, the text "The weather is lovely today" becomes a list of 384 floating-point numbers. Two sentences with similar meaning end up as vectors that are geometrically close in that 384-dimensional space, even when they share no words. "It's so sunny outside!" is a neighbor; "She drove to the stadium." is far away.
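To make "geometrically close" concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are invented for illustration; real embeddings from a model have hundreds of dimensions, and the function name cosine_similarity is just a label chosen for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings", made up for this example.
# They are NOT real model output.
weather = [0.9, 0.1, 0.0]   # stands in for "The weather is lovely today"
sunny = [0.8, 0.3, 0.1]     # stands in for "It's so sunny outside!"
stadium = [0.1, 0.2, 0.9]   # stands in for "She drove to the stadium."

print(round(cosine_similarity(weather, sunny), 2))    # 0.96
print(round(cosine_similarity(weather, stadium), 2))  # 0.13
```

Vectors that point in nearly the same direction score close to 1, while vectors that point in unrelated directions score close to 0.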

The 10-line walkthrough. Using the sentence-transformers library:

# Install first, from a shell: pip install -U sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The weather is lovely today",
    "It's so sunny outside!",
    "She drove to the stadium."
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)

similarities = util.cos_sim(embeddings, embeddings)
print(similarities)  # 3x3 matrix of pairwise cosine similarities
# sentences 0 and 1 score roughly 0.67; sentence 2 scores roughly 0.1 against both

That similarity computation is the core of RAG retrieval: embed your query, embed your documents, and return the documents with the highest cosine similarity to the query.
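The retrieval step described above can be sketched without any model at all. In the example below, the document texts, their vectors, and the query vector are hypothetical stand-ins; in a real pipeline the vectors would come from model.encode, and top_k is a helper name invented for this sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return (index, score) pairs for the k documents most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical documents and hand-made vectors (assumptions, not model output).
documents = [
    "How do I enroll in a plan?",
    "What does the plan cover?",
    "Where is the stadium?",
]
doc_vecs = [
    [0.9, 0.2, 0.1],
    [0.3, 0.9, 0.1],
    [0.1, 0.1, 0.9],
]
query_vec = [0.8, 0.3, 0.0]  # stands in for an embedded enrollment question

for index, score in top_k(query_vec, doc_vecs, k=2):
    print(f"{score:.2f}  {documents[index]}")
```

This brute-force scan compares the query against every document, which is fine for small collections. Vector databases typically replace the scan with an index so that the search stays fast as the collection grows.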

Run it now. HuggingFace's "Getting Started With Embeddings" blog walks through a real-world semantic search over a Medicare FAQ dataset — with a Colab notebook you can open and run immediately. Start with all-MiniLM-L6-v2 (fast, light, general-purpose); move to all-mpnet-base-v2 for better quality on more demanding tasks. Over 10,000 pretrained models are on the sentence-transformers HuggingFace page — many optimized for code, multilingual text, or long documents.

Where this leads. Once you can generate embeddings and compute similarity, every subsequent RAG concept clicks: chunking is about what unit you embed, pgvector/Chroma/FAISS store the resulting vectors, retrieval is the cosine search you just ran, and reranking is a second pass on the top-k results. This 10-line example is the foundation.
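As one illustration of "what unit you embed", here is a minimal sketch of fixed-size chunking with overlap. The word-based window, the function name chunk_words, and the default sizes are assumptions chosen for this example; real pipelines often split on sentences, tokens, or document structure instead.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous chunk."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

# Tiny example with 10 one-letter "words" so the windows are easy to see.
print(chunk_words("a b c d e f g h i j", size=4, overlap=1))
# ['a b c d', 'd e f g', 'g h i j']
```

Each chunk would then be embedded on its own and stored next to its vector, so a query can retrieve the specific passage rather than a whole document.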

Sources: Sentence Transformers official docs (sbert.net), HuggingFace: Getting Started With Embeddings, Colab notebook, sentence-transformers on HuggingFace Hub