Vespa

Vespa is an open-source big data serving engine originally developed at Yahoo and open-sourced in 2017. It combines vector search, structured retrieval, full-text search, and machine-learned model inference in a single distributed platform. Vespa scales to billions of documents and supports HNSW indexing, learning-to-rank models, neural rankers, and ColBERT-style late interaction, making it one of the more feature-complete platforms for production search and recommendation. Unlike vector databases that focus narrowly on similarity search, Vespa is designed for end-to-end ranking pipelines that combine BM25, dense retrieval, hybrid scoring, and personalization signals. The platform powers production workloads at Spotify, Wayfair, and financial services firms. Vespa Cloud, the managed service, runs on AWS and GCP, while self-hosted Vespa runs on Kubernetes or bare metal. AI governance teams adopt Vespa when their workloads require hybrid retrieval that a vector-only database cannot deliver, especially in regulated search and recommendation contexts.
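To make the idea of hybrid scoring concrete, the following is a minimal sketch in plain Python of blending a lexical BM25 score with a dense cosine-similarity score. It is not Vespa's API or ranking-expression syntax. The function names (bm25_scores, hybrid_rank), the min-max normalization step, and the alpha blending weight are all assumptions chosen for illustration, and the sketch assumes a small, non-empty in-memory corpus of pre-tokenized documents.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with classic BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so lexical and dense scores are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query_terms, query_vec, docs, doc_vecs, alpha=0.5):
    """Blend normalized BM25 and cosine scores; alpha weights the lexical side.

    Returns (document indices sorted best-first, combined score per document).
    """
    lexical = min_max(bm25_scores(query_terms, docs))
    dense = min_max([cosine(query_vec, v) for v in doc_vecs])
    combined = [alpha * l + (1 - alpha) * d for l, d in zip(lexical, dense)]
    order = sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)
    return order, combined
```

Calling hybrid_rank with a tokenized query, a query embedding, and parallel lists of tokenized documents and document embeddings returns the ranked document indices along with their blended scores. In a real Vespa deployment the equivalent blend would be expressed in a rank profile and evaluated on the content nodes rather than in application code.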

Vespa + Centralpoint hybrid retrieval: Centralpoint can pair Vespa-powered hybrid search with any generative LLM (OpenAI, Claude, Gemini, or Llama) for retrieval-augmented workflows. The model-agnostic stack meters tokens centrally, keeps prompts on-premises, and deploys a fleet of specialized chatbots across portals through one line of JavaScript.


Related Keywords:
Vespa