Self-Consistency

Self-consistency is a prompting technique introduced by Wang et al. (Google) in 2022 that improves chain-of-thought (CoT) reasoning. Instead of relying on a single greedy chain, it samples multiple independent reasoning chains from the LLM for the same problem (using temperature > 0 to introduce variation) and takes the majority-vote answer across them. The intuition: if a model reaches the same answer through several distinct reasoning paths, that consensus is more reliable than any single chain. On GSM8K math problems with PaLM-540B, the paper reports accuracy rising from 56.5% (single greedy CoT) to 74.4% with 40 sampled chains.

The technique is simple to implement: set temperature to roughly 0.7-1.0, sample N chains in parallel (typically 5-40), extract the final answer from each, and vote. For non-numeric answers, voting requires answer normalization (lowercase, strip punctuation, canonicalize synonyms) and sometimes semantic clustering of similar answers. The cost is linear in N (10 sampled chains cost roughly 10x a single call), so self-consistency is usually reserved for high-stakes reasoning rather than routine queries.

Self-consistency also interacts with newer techniques. Combined with verifier models (separate LLMs trained to score reasoning chains), it becomes weighted voting; combined with Tree of Thoughts, it becomes search with aggregation. With frontier reasoning models that internally perform sampling-style exploration, the marginal benefit of explicit self-consistency has shrunk, but it remains a strong technique for non-reasoning models. AI governance teams sometimes log all sampled chains for forensic analysis when a final answer is challenged, because the distribution of answers across chains reveals model uncertainty in a way single-chain output does not.
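The sample, extract, normalize, and vote steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `sample_chain` is a hypothetical callable standing in for whatever LLM client is used at temperature > 0, the "Answer: <value>" output convention is an assumption about how the prompt asks the model to finish, and `score_chain` is a hypothetical verifier hook that turns plain majority voting into weighted voting. A canned fake sampler replaces the model so the sketch runs offline.

```python
import itertools
import re
from collections import Counter
from typing import Callable, List, Optional, Tuple


def extract_answer(chain: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker (assumed output format)."""
    matches = re.findall(r"answer\s*:\s*(.+)", chain, flags=re.IGNORECASE)
    return matches[-1] if matches else None


def normalize(answer: str) -> str:
    """Lowercase and strip punctuation so equivalent answers vote together."""
    text = answer.strip().lower().strip(" .,;:!?\"'")
    # Canonicalize simple numerics, e.g. "$1,200.0" -> "1200".
    candidate = text.replace(",", "").lstrip("$")
    try:
        number = float(candidate)
    except ValueError:
        return text
    return str(int(number)) if number.is_integer() else str(number)


def self_consistency(
    prompt: str,
    sample_chain: Callable[[str], str],  # hypothetical: one LLM call at temperature > 0
    n: int = 10,
    score_chain: Optional[Callable[[str], float]] = None,  # hypothetical verifier
) -> Tuple[Optional[str], Counter, List[str]]:
    """Sample n chains, vote on normalized answers, and keep every chain for audit."""
    votes: Counter = Counter()
    chains: List[str] = []
    for _ in range(n):
        chain = sample_chain(prompt)
        chains.append(chain)
        raw = extract_answer(chain)
        if raw is None:
            continue  # chains without a parseable answer cast no vote
        weight = score_chain(chain) if score_chain else 1.0
        votes[normalize(raw)] += weight
    if not votes:
        return None, votes, chains
    winner, _ = votes.most_common(1)[0]
    return winner, votes, chains


def make_fake_sampler(canned: List[str]) -> Callable[[str], str]:
    """Offline stand-in for a real model: cycles through canned chains."""
    source = itertools.cycle(canned)
    return lambda prompt: next(source)


CANNED = [
    "16 - 3 - 4 = 9 eggs left. 9 * 2 = 18. Answer: $18",
    "She sells 9 eggs at $2 each. Answer: 18.",
    "16 - 3 = 13, 13 * 2 = 26. Answer: 26",
    "Remaining eggs: 9. Revenue: 18 dollars. Answer: 18",
    "No final line here",
]

winner, votes, chains = self_consistency("egg problem", make_fake_sampler(CANNED), n=5)
print(winner, dict(votes))
```

Returning the full vote tally and the raw chains alongside the winner is a deliberate choice here: the spread of votes is the uncertainty signal mentioned above, and the stored chains are what would be logged for later review. In a real deployment the loop would typically issue the N calls concurrently, and synonym canonicalization or semantic clustering would replace the simple `normalize` helper for free-text answers.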

Voting and consensus discipline from 25 years of governance: Centralpoint treats self-consistency sampling, voting outcomes, and dissenting chains as a single governed record — the same multi-source consensus discipline Oxcyon has applied to data reconciliation for 25 years. Self-consistency runs on-premise, tokens meter per skill, and consensus chatbots deploy through one line of JavaScript.

Related Keywords:
Self-Consistency, Oxcyon, AI, AI Governance, Generative AI, Inference, Inferencing, RAG, Prompts, Skills Manager
