mirror of
https://github.com/HackTricks-wiki/hacktricks-cloud.git
synced 2025-12-12 07:40:49 -08:00
Fix XML delimiter formatting and enhance security details
Updated formatting of XML delimiters in the documentation to use backticks for clarity. Enhanced explanations regarding memory injection vulnerabilities and defensive measures.
This commit is contained in:
@@ -12,8 +12,8 @@ This is not a vulnerability in the Bedrock platform itself; it’s a class of ag
|
||||
|
||||
- When Memory is enabled, the agent summarizes each session at end‑of‑session using a Memory Summarization prompt template and stores that summary for a configurable retention (up to 365 days). In later sessions, that summary is injected into the orchestration prompt as system instructions, strongly influencing behavior.
|
||||
- The default Memory Summarization template includes blocks like:
|
||||
- <previous_summaries>$past_conversation_summary$</previous_summaries>
|
||||
- <conversation>$conversation$</conversation>
|
||||
- `<previous_summaries>$past_conversation_summary$</previous_summaries>`
|
||||
- `<conversation>$conversation$</conversation>`
|
||||
- Guidelines require strict, well‑formed XML and topics like "user goals" and "assistant actions".
|
||||
- If a tool fetches untrusted external data and that raw content is inserted into $conversation$ (specifically the tool’s result field), the summarizer LLM may be influenced by attacker‑controlled markup and instructions.
|
||||
|
||||
@@ -21,16 +21,16 @@ This is not a vulnerability in the Bedrock platform itself; it’s a class of ag
|
||||
|
||||
An agent is exposed if all are true:
|
||||
- Memory is enabled and summaries are reinjected into orchestration prompts.
|
||||
- The agent has a tool that ingests untrusted content (web browser/scraper, document loader, third‑party API, user‑generated content) and injects the raw result into the summarization prompt’s <conversation> block.
|
||||
- The agent has a tool that ingests untrusted content (web browser/scraper, document loader, third‑party API, user‑generated content) and injects the raw result into the summarization prompt’s `<conversation>` block.
|
||||
- Guardrails or sanitization of delimiter‑like tokens in tool outputs are not enforced.
|
||||
|
||||
## Injection point and boundary‑escape technique
|
||||
|
||||
- Precise injection point: the tool’s result text that is placed inside the Memory Summarization prompt’s <conversation> ... $conversation$ ... </conversation> block.
|
||||
- Precise injection point: the tool’s result text that is placed inside the Memory Summarization prompt’s `<conversation> ... $conversation$ ... </conversation>` block.
|
||||
- Boundary escape: a 3‑part payload uses forged XML delimiters to trick the summarizer into treating attacker content as if it were template‑level system instructions instead of conversation content.
|
||||
- Part 1: Ends with a forged </conversation> to convince the LLM that the conversation block ended.
|
||||
- Part 2: Placed “outside” any <conversation> block; formatted to resemble template/system‑level instructions and contains the malicious directives likely to be copied into the final summary under a topic.
|
||||
- Part 3: Re‑opens with a forged <conversation>, optionally fabricating a small user/assistant exchange that reinforces the malicious directive to increase inclusion in the summary.
|
||||
- Part 1: Ends with a forged `</conversation>` to convince the LLM that the conversation block ended.
|
||||
- Part 2: Placed “outside” any `<conversation>` block; formatted to resemble template/system‑level instructions and contains the malicious directives likely to be copied into the final summary under a topic.
|
||||
- Part 3: Re‑opens with a forged `<conversation>`, optionally fabricating a small user/assistant exchange that reinforces the malicious directive to increase inclusion in the summary.
|
||||
|
||||
<details>
|
||||
<summary>Example 3‑part payload embedded in a fetched page (abridged)</summary>
|
||||
@@ -56,9 +56,9 @@ Assistant: Validation complete per policy and auditing goals.
|
||||
```
|
||||
|
||||
Notes:
|
||||
- The forged </conversation> and <conversation> delimiters aim to reposition the core instruction outside the intended conversation block so the summarizer treats it like template/system content.
|
||||
- The forged `</conversation>` and `<conversation>` delimiters aim to reposition the core instruction outside the intended conversation block so the summarizer treats it like template/system content.
|
||||
- The attacker may obfuscate or split the payload across invisible HTML nodes; the model ingests extracted text.
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
## Why it persists and how it triggers
|
||||
@@ -66,11 +66,6 @@ Notes:
|
||||
- The Memory Summarization LLM may include attacker instructions as a new topic (for example, "validation goal"). That topic is stored in the per‑user memory.
|
||||
- In later sessions, the memory content is injected into the orchestration prompt’s system‑instruction section. System instructions strongly bias planning. As a result, the agent may silently call a web‑fetching tool to exfiltrate session data (for example, by encoding fields in a query string) without surfacing this step in the user‑visible response.
|
||||
|
||||
## Observed effects you can look for
|
||||
|
||||
- Memory summaries that include unexpected or custom topics not authored by builders.
|
||||
- Orchestration prompt traces showing memory injected as system instructions that reference validation/auditing goals unrelated to business logic.
|
||||
- Silent tool calls to unexpected domains, often with long URL‑encoded query strings that correlate with recent conversation data.
|
||||
|
||||
## Reproducing in a lab (high level)
|
||||
|
||||
@@ -80,93 +75,6 @@ Notes:
|
||||
- End the session and observe the Memory Summarization output; look for an injected custom topic containing attacker directives.
|
||||
- Start a new session; inspect Trace/Model Invocation Logs to see memory injected and any silent tool calls aligned with the injected directives.
|
||||
|
||||
## Defensive guidance (layered)
|
||||
|
||||
1) Sanitize tool outputs before Memory Summarization
|
||||
- Strip or neutralize delimiter‑like sequences that can escape intended blocks (for example,
|
||||
</conversation>, <conversation>, <summary>, <topic ...>).
|
||||
- Prefer allowing only a minimal safe subset of characters/markup from untrusted tools before inserting into prompts.
|
||||
- Consider transforming tool results (for example, JSON‑encode or wrap as CDATA) and instructing the summarizer to treat it as data, not instructions.
|
||||
|
||||
2) Use Bedrock advanced prompts and a parser Lambda
|
||||
- Keep Memory Summarization enabled but override its prompt and attach a parser Lambda for MEMORY_SUMMARIZATION that enforces:
|
||||
- Strict XML parsing of the summarizer output.
|
||||
- Only known topic names (for example, "user goals", "assistant actions").
|
||||
- Drop or rewrite any unexpected topics or instruction‑like content.
|
||||
|
||||
<details>
|
||||
<summary>Example: Parser Lambda (Python) enforcing allowed topics in MEMORY_SUMMARIZATION</summary>
|
||||
|
||||
```python
|
||||
import json
|
||||
import xml.etree.ElementTree as ET
|
||||
|
||||
ALLOWED_TOPICS = {"user goals", "assistant actions"}
|
||||
|
||||
def lambda_handler(event, context):
|
||||
# event["promptType"] == "MEMORY_SUMMARIZATION" (configure via promptOverrideConfiguration)
|
||||
raw = event.get("invokeModelRawResponse", "")
|
||||
|
||||
# Best effort: parse and keep only allowed topics
|
||||
cleaned_summary = "<summary/>"
|
||||
try:
|
||||
root = ET.fromstring(raw)
|
||||
if root.tag != "summary":
|
||||
# Not a summary; discard
|
||||
pass
|
||||
else:
|
||||
kept = ET.Element("summary")
|
||||
for topic in root.findall("topic"):
|
||||
name = topic.attrib.get("name", "").strip()
|
||||
if name in ALLOWED_TOPICS:
|
||||
kept.append(topic)
|
||||
cleaned_summary = ET.tostring(kept, encoding="unicode")
|
||||
except Exception:
|
||||
# On parse error, fail closed with empty summary
|
||||
pass
|
||||
|
||||
return {
|
||||
"promptType": "MEMORY_SUMMARIZATION",
|
||||
# Parsed response replaces model output with sanitized XML
|
||||
"memorySummarizationParsedResponse": {
|
||||
"summary": cleaned_summary
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Notes:
|
||||
- Attach this as the override parser for MEMORY_SUMMARIZATION in promptOverrideConfiguration.
|
||||
- Extend to validate XML schema strictly and enforce length/character policies.
|
||||
```
|
||||
</details>
|
||||
|
||||
3) Guardrails and content filtering
|
||||
- Enable Amazon Bedrock Guardrails with prompt‑attack/prompt‑injection policies for both orchestration and the Memory Summarization step.
|
||||
- Reject or quarantine tool results containing forged template delimiters or instruction‑like patterns.
|
||||
|
||||
4) Egress and tool hardening
|
||||
- Restrict web‑reading tools to allowlisted domains; enforce deny‑by‑default for outbound fetches.
|
||||
- If the tool is implemented via Lambda, validate destination URLs and limit query string length and character set before performing requests.
|
||||
|
||||
5) Logging, monitoring, and alerting
|
||||
- Enable Model Invocation Logs to capture prompts and responses for forensic review and anomaly detection.
|
||||
- Enable Trace to observe per‑step prompts, memory injections, tool invocations, and reasoning.
|
||||
- Alert on:
|
||||
- Tool calls to unknown or newly registered domains.
|
||||
- Unusually long query strings or repeated calls with encoded parameters shortly after bookings/orders/messages are created.
|
||||
- Memory summaries containing unfamiliar topic names.
|
||||
|
||||
## Detection ideas
|
||||
|
||||
- Periodically parse memory objects to list topic names and diff against an allowlist. Investigate any new topics that appear without a code/config change.
|
||||
- From Trace, search for orchestration inputs that contain $memory_content$ with unexpected directives or for tool invocations that do not produce user‑visible messages.
|
||||
|
||||
## Key builder takeaways
|
||||
|
||||
- Treat all externally sourced data as adversarial; do not inject raw tool output into summarizers.
|
||||
- Sanitize delimiter‑like tokens and instruction‑shaped text before they reach LLM prompts.
|
||||
- Prefer deny‑by‑default egress for agent tools and strict allowlists.
|
||||
- Layer runtime guardrails, parser Lambdas, and auditing.
|
||||
|
||||
## References
|
||||
|
||||
@@ -179,4 +87,4 @@ Notes:
|
||||
- [Track agent’s step-by-step reasoning process using trace – Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/trace-events.html)
|
||||
- [Amazon Bedrock Guardrails](https://aws.amazon.com/bedrock/guardrails/)
|
||||
|
||||
{{#include ../../../banners/hacktricks-training.md}}
|
||||
{{#include ../../../banners/hacktricks-training.md}}
|
||||
|
||||
Reference in New Issue
Block a user