OWASP_LLM_TOP10 LLM04:2025: Data and Model Poisoning

Founder at SQUR · last verified 2026-06-20

Regulation text

Data and model poisoning attacks occur when an attacker manipulates the pretraining, fine-tuning, or embedding data of the LLM to introduce vulnerabilities, backdoors, or biases. This can degrade model performance, introduce harmful biases, or create exploit conditions activated by specific triggers, threatening security, ethics, and reliability.

ATT&CK techniques this article tests · 15

Technique	Why it maps	Confidence
T1195	Attackers compromise data sources or model dependencies, injecting poisoned data or malicious code into the supply chain. This directly enables data and model poisoning.	90%
T1078	Attackers use stolen or legitimate credentials to access training environments or data repositories, allowing direct manipulation of data or model parameters.	80%
T1547	If the poisoning mechanism involves persistent code within the training environment, modifying system startup scripts or services ensures the malicious changes persist across restarts.	60%
T1068	Exploiting vulnerabilities in data ingestion or model management systems grants higher access, enabling an attacker to modify critical training data or model components.	70%
T1036	Poisoned data or model modifications are disguised as legitimate components, evading detection during data validation or model integrity checks.	80%
T1003	Obtaining credentials from the operating system allows access to sensitive data sources or model registries, facilitating data or model manipulation.	70%
T1082	Understanding the LLM's architecture, data pipelines, and training configurations is crucial for targeted and effective data or model poisoning attacks.	70%
T1083	Locating specific training datasets, model checkpoints, or configuration files is a prerequisite for modifying them to introduce poisoning.	80%
T1005	Collecting data from the local training system allows an attacker to analyze, modify, and re-inject poisoned data, or to exfiltrate sensitive information.	80%
T1039	Accessing shared network drives containing training data or model artifacts provides a direct vector for injecting poisoned data into the LLM's lifecycle.	70%
T1021	Accessing remote systems, such as data lakes or GPU clusters, involved in the LLM lifecycle enables broader data or model poisoning capabilities.	60%
T1071	Using common application protocols allows attackers to inject poisoned data or control a backdoored model, potentially exfiltrating data or manipulating output.	60%
T1041	Exfiltrating sensitive data or model weights through a backdoored LLM or compromised training environment is a direct consequence of successful poisoning.	70%
T1490	Corrupting model backups, training logs, or data snapshots prevents recovery from poisoning attacks, ensuring the malicious changes persist.	70%
T1499	Causing the LLM to become unstable, produce nonsensical output, or crash directly impacts its availability and reliability, which is the goal of availability-focused poisoning.	80%

Defending mitigations · 6

Mitigation	What it does	Confidence
M1057	Data Loss Prevention prevents unauthorized modification or exfiltration of training data and model artifacts, directly countering poisoning attempts.	90%
M1047	Auditing detects anomalous changes to training data, model configurations, and pipeline execution logs, providing early warning of poisoning.	80%
M1037	Filtering network traffic blocks malicious data injections or suspicious communication patterns related to data or model poisoning.	70%
M1035	Limiting access to resources restricts who can modify critical training data, model parameters, and infrastructure components, reducing attack surface.	90%
M1018	User Account Management enforces strong authentication and authorization for all users and services interacting with the LLM lifecycle, preventing unauthorized access.	80%
M1016	Vulnerability Scanning identifies and remediates vulnerabilities in the data pipeline, model training infrastructure, and dependencies that could be exploited for poisoning.	70%
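
The auditing mitigation above can be illustrated with a hash manifest over the training data directory. This is a minimal sketch, not a description of any SQUR tooling: the function names `build_manifest` and `diff_manifests` are hypothetical, and it assumes the dataset is a directory of files small enough to read into memory and that the trusted manifest is stored somewhere an attacker with write access to the data cannot also modify.

```python
import hashlib
from pathlib import Path


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(data_dir).as_posix()] = digest
    return manifest


def diff_manifests(trusted: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed, or modified since the trusted snapshot."""
    shared = set(trusted) & set(current)
    return {
        "added": sorted(set(current) - set(trusted)),
        "removed": sorted(set(trusted) - set(current)),
        "modified": sorted(name for name in shared if trusted[name] != current[name]),
    }
```

Run before each training job, a non-empty diff against the last approved manifest is a signal to stop and review. It only shows that files changed, not whether the change was malicious.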

Underlying weaknesses · 6

CWE	Why it persists	Confidence
CWE-20	Improper Input Validation allows malicious data to enter the training pipeline, directly leading to data poisoning and model compromise.	90%
CWE-345	Insufficient Verification of Data Authenticity fails to validate the integrity and source of training data, enabling the injection of poisoned samples.	80%
CWE-284	Improper Access Control permits unauthorized users or processes to modify training data or model parameters, facilitating poisoning attacks.	80%
CWE-502	Deserialization of Untrusted Data vulnerabilities in processing serialized model components or data can introduce malicious code or data, leading to poisoning.	70%
CWE-798	Use of Hard-coded Credentials simplifies unauthorized access to data repositories or model management systems, providing an entry point for poisoning.	70%
CWE-434	Unrestricted Upload of File with Dangerous Type enables attackers to upload malicious data files or model components that can poison the LLM.	70%
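
For CWE-20, one way to picture input validation at the edge of a training pipeline is a per-record gate on incoming JSONL data. This is a sketch under stated assumptions: the record shape (`text` and `label` fields), the label set, the length cap, and the function name `validate_record` are all hypothetical and would need to match the real dataset schema. Structural checks like these reject malformed or disguised records, but they do not catch well-formed samples whose content is poisoned.

```python
import json
import unicodedata

ALLOWED_LABELS = {"benign", "malicious"}  # hypothetical label set
MAX_TEXT_CHARS = 8_000                    # hypothetical length cap


def validate_record(line: str) -> tuple[bool, str]:
    """Return (accepted, reason) for one JSONL training record."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False, "not valid JSON"
    if not isinstance(record, dict):
        return False, "not a JSON object"
    if set(record) != {"text", "label"}:
        return False, "unexpected or missing fields"
    text, label = record["text"], record["label"]
    if not isinstance(text, str) or not text.strip():
        return False, "text is empty or not a string"
    if len(text) > MAX_TEXT_CHARS:
        return False, "text exceeds length cap"
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        return False, "label outside the allowed set"
    # Reject control (Cc) and format (Cf) characters, which can hide
    # invisible strings in otherwise normal text; newline and tab are kept.
    for ch in text:
        if ch not in "\n\t" and unicodedata.category(ch) in {"Cc", "Cf"}:
            return False, "text contains control or format characters"
    return True, "ok"
```

Rejected lines are best logged with their reason and source rather than silently dropped, so that a spike in rejections from one data source shows up in audit review.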

What SQUR Covers

Web application + API pentesting for OWASP Top 10, business logic flaws, authentication bypass, injection attacks, and other application-layer vulnerabilities. €1,995 per scan, 24-hour turnaround, EU-only data.

What SQUR Does Not Cover

Internal network pentesting, endpoint security testing, physical security assessments, social engineering, or ICT third-party concentration risk reviews. Engage a complementary provider for those scope items.

Provenance

Mapped Q2.2026 using gemini-2.5-flash · €0.0172 compute · voice-rubric self-validated