Data Poisoning

Data poisoning is an adversarial attack in which malicious actors inject crafted records into a model's training data to degrade overall accuracy, introduce backdoors, or cause targeted misclassifications on specific inputs. The threat has grown sharper as LLMs ingest open web data and as RAG systems pull from corpora that adversaries may be able to influence.

The classical taxonomy distinguishes four attack types: availability attacks (poison the training set to make the model broadly worse), integrity attacks (cause specific misclassifications, e.g., make every email containing a trigger phrase get classified as not-spam), backdoor attacks (the model behaves correctly on normal inputs but produces attacker-chosen outputs when a hidden trigger appears, see BadNets, Gu et al. 2017), and clean-label attacks (poison samples appear correctly labeled to a human reviewer but still corrupt the model).

For LLMs, the documented threats include web-scrape poisoning (post adversarial content to indexed sites before the next training run, see Carlini et al. "Poisoning Web-Scale Training Datasets is Practical" 2023), RLHF poisoning (corrupt the preference data), SFT poisoning (corrupt the supervised fine-tuning examples), and RAG poisoning (insert adversarial passages into a corpus that will be retrieved and trusted by an LLM, see Zou et al. "PoisonedRAG" 2024).

Defenses include training-data provenance verification (only ingest from trusted sources), outlier detection (flag records that look anomalous), differential privacy (bounds the influence any single record can have on the trained model), data lineage tracking (trace a corrupted model back to its source data), and red-team probing for backdoor triggers. For RAG specifically, defenses include source allowlisting, content signing, retrieval-time provenance display ("this answer cited X, which was posted by Y on Z date"), and human review for high-stakes outputs. AI governance teams treat data poisoning as a critical risk for any AI system that consumes data from sources the organization does not fully control.
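The RAG defenses named above (source allowlisting, content signing, and retrieval-time provenance display) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the Chunk record, the ALLOWED_HOSTS set, the SIGNING_KEY value, and every function name are hypothetical and do not come from any particular product. It assumes each chunk is signed with an HMAC at ingestion time and re-checked before it reaches the LLM.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical values, for illustration only.
ALLOWED_HOSTS = {"docs.example.com", "kb.example.com"}
SIGNING_KEY = b"demo-key-do-not-use-in-production"


@dataclass(frozen=True)
class Chunk:
    text: str
    source_url: str
    author: str
    posted_on: str  # ISO date string, e.g. "2024-01-05"
    signature: str  # hex HMAC-SHA256 recorded at ingestion time


def sign(text, source_url, author, posted_on, key=SIGNING_KEY):
    """Return a hex HMAC-SHA256 over the chunk text and its provenance fields."""
    payload = json.dumps([text, source_url, author, posted_on]).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def make_chunk(text, source_url, author, posted_on):
    """Ingestion step: attach a signature to a chunk from a vetted pipeline."""
    return Chunk(text, source_url, author, posted_on,
                 sign(text, source_url, author, posted_on))


def is_trusted(chunk, key=SIGNING_KEY):
    """Retrieval step: enforce the allowlist, then verify the signature."""
    host = urlparse(chunk.source_url).hostname
    if host not in ALLOWED_HOSTS:
        return False
    expected = sign(chunk.text, chunk.source_url, chunk.author,
                    chunk.posted_on, key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"),
                               chunk.signature.encode("utf-8"))


def filter_retrieved(chunks):
    """Drop chunks that fail the allowlist or signature check."""
    return [c for c in chunks if is_trusted(c)]


def provenance_line(chunk):
    """Render the provenance display shown next to an answer."""
    return (f"This answer cited {chunk.source_url}, "
            f"which was posted by {chunk.author} on {chunk.posted_on}.")
```

In this sketch, a chunk whose text was altered after ingestion fails the signature check, and a chunk from a host outside the allowlist is rejected even if it carries a valid signature. The two checks are kept separate because they address different failure modes: tampering within a trusted store versus content from an untrusted source.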

Provenance integrity from 25 years of source verification: Centralpoint has verified data provenance and source authenticity for regulated-industry clients for 25 years. Centralpoint positions that provenance discipline as the strongest defense against RAG poisoning, since every retrieved chunk carries lineage back to a verified source. Provenance runs on-premise, tokens meter per skill, and provenance-grounded chatbots deploy through one line of JavaScript.

Related Keywords:
Data Poisoning, Oxcyon, AI, AI Governance, Generative AI, Inference, Inferencing, RAG, Prompts, Skills Manager
