Semantic Search
Semantic search is the umbrella term for search systems that retrieve results based on meaning rather than literal keyword overlap, typically powered by dense retrieval over embedding vectors. The defining property is that a query like "how to fire an employee" can surface a document that uses the word "terminate" or "discharge" because the embedding model has learned that those concepts are close in vector space. Semantic search emerged in production around 2018-2019 with Google's BERT-powered ranking update and accelerated dramatically with the wave of open-weight embedding models from 2022 onward.
A production semantic search system has three components: an offline indexing pipeline that embeds every document and stores the vectors in a vector database; a query-time pipeline that embeds the query with the same model and runs a k-nearest-neighbor lookup; and, optionally, a reranking stage that applies a cross-encoder to the top candidates.
The biggest pitfall is domain mismatch: generic embedding models (trained on web text and Wikipedia) underperform on specialized vocabularies such as medical, legal, or technical content. Solutions include fine-tuning embeddings with contrastive learning on domain triplets (anchor, positive, negative), using domain-specific embedding models (BioBERT, Legal-BERT, SPECTER for scientific papers), and combining dense retrieval with BM25 in a hybrid search setup. AI governance teams pair semantic search with audit logging of the source-document IDs returned, since "the model found this related" is harder to defend in regulated workflows than "the document literally contains these words."
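The indexing, query-time, hybrid, and audit-logging steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: `toy_embed` and its `CONCEPTS` lexicon are hypothetical stand-ins for a real embedding model, the plain dictionary stands in for a vector database, the keyword ranker is a simple term-overlap substitute for BM25, and reciprocal rank fusion is just one common way to merge the two rankings (the text above does not prescribe a fusion method). The cross-encoder reranking stage is omitted.

```python
import math
from collections import Counter

# Hypothetical stand-in for a learned embedding model: words that mean the
# same thing share a "concept" axis, so synonyms point in the same direction.
CONCEPTS = {
    "fire": 0, "terminate": 0, "discharge": 0,
    "employee": 1, "staff": 1, "worker": 1,
    "invoice": 2, "billing": 2, "payment": 2,
}
DIM = 3
STOPWORDS = {"how", "to", "an", "a", "for", "the"}

def tokenize(text):
    words = (w.strip(".,?!\"'").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]

def toy_embed(text):
    """Return a unit-length concept vector (assumption: replaces a real model)."""
    vec = [0.0] * DIM
    for word in tokenize(text):
        if word in CONCEPTS:
            vec[CONCEPTS[word]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def build_index(docs):
    """Offline step: embed every document once and store the vectors by ID."""
    return {doc_id: toy_embed(text) for doc_id, text in docs.items()}

def knn(index, query, k):
    """Query-time step: embed the query with the same model, rank by cosine."""
    q = toy_embed(query)
    def score(vec):
        return sum(a * b for a, b in zip(q, vec))  # vectors are unit length
    ranked = sorted(index, key=lambda doc_id: score(index[doc_id]), reverse=True)
    return ranked[:k]

def keyword_rank(docs, query, k):
    """Literal term-overlap ranking (a simplified substitute for BM25)."""
    q_terms = set(tokenize(query))
    overlap = {d: len(q_terms & set(tokenize(t))) for d, t in docs.items()}
    ranked = sorted(overlap, key=overlap.get, reverse=True)
    return [d for d in ranked[:k] if overlap[d] > 0]

def rrf(rankings, c=60):
    """Reciprocal rank fusion: sum 1 / (c + rank) across the input rankings."""
    scores = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (c + rank)
    return [doc_id for doc_id, _ in scores.most_common()]

def search_with_audit(docs, index, query, k, audit_log):
    """Hybrid search that records the returned source-document IDs."""
    fused = rrf([knn(index, query, k), keyword_rank(docs, query, k)])[:k]
    audit_log.append({"query": query, "doc_ids": fused})
    return fused

docs = {
    "hr-001": "Procedure to terminate a staff member for cause.",
    "fin-002": "How to submit an invoice for payment.",
    "hr-003": "How to onboard a new employee.",
}
index = build_index(docs)
print(knn(index, "how to fire an employee", k=2))          # ['hr-001', 'hr-003']
print(keyword_rank(docs, "how to fire an employee", k=2))  # ['hr-003']
```

In this toy corpus the dense lookup ranks "hr-001" first even though it shares no content word with the query, while the keyword ranker only finds "hr-003", which literally contains "employee". Fusing the two lists keeps both candidates, and the audit record captures exactly which document IDs were returned for the query.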
Semantic search is the natural successor to natural-language search (NLS): Oxcyon spent 25 years building NLS with synonym expansion, taxonomy mapping, and audience-aware relevance, which are exactly the problems modern semantic search addresses. Centralpoint now layers vector-based semantic retrieval on top of that NLS heritage, on-premise, with tokens metered per skill and embedded chatbots deployed through one line of JavaScript.
Related Keywords:
Semantic Search, Oxcyon, AI, AI Governance, Generative AI, Inference, Inferencing, RAG, Prompts, Skills Manager