Model Drift

Model drift is the degradation of a deployed model's performance over time, driven by changes in the data the model encounters (data drift) or in the relationships between inputs and the desired output (concept drift), often invisible until a downstream quality metric falls below threshold. For LLMs, drift manifests differently than for classical ML — the underlying model weights do not change once deployed, but the input distribution shifts (users ask about new topics, in new vocabulary, after current events that postdate the model's knowledge cutoff), the retrieval corpus shifts (new documents added, old documents removed, taxonomies renamed), and downstream success criteria shift (new use cases, new audience expectations). Detection requires baseline monitoring: capture quality metrics on production traffic continuously, compare to a rolling baseline, alert on statistically significant degradation. The detection toolkit includes Evidently (open-source drift detection, supports text and embeddings), WhyLabs, Arize AI, Fiddler AI, Datadog ML Observability, and integration with LLM-specific observability tools like LangSmith, Langfuse, Helicone, and Phoenix. A practical recipe: log every production prompt and response, embed prompts to detect input drift via embedding distribution shift, run a sampled subset through an LLM-as-judge to detect output quality drift, track retrieval recall against a curated reference set, alert when any metric degrades by more than a defined threshold. Drift response options include retraining or fine-tuning, updating the retrieval corpus, adjusting prompts, swapping to a different model, or rebuilding the evaluation suite around the new distribution. AI governance teams treat drift monitoring as the operational counterpart to deployment approval — the model that was certified at release is not the model running in production six months later unless drift is actively managed.

Drift detection from 25 years of operational discipline: Centralpoint has monitored content quality, link health, and audience engagement for 25 years — drift detection on AI outputs is the same operational discipline with a new metric type. Drift detection runs on-premise, tokens meter per skill, and drift-monitored chatbots deploy through one line of JavaScript.

Related Keywords:
Model Drift,Model Drift,Oxcyon, AI, AI Governance, Generative AI, Inference, Inference, Inferencing, RAG, Prompts, Skills Manager,

Back