The above example is a scenario where an LLM has function calling enabled. It can send emails using an attached Model Context Protocol (MCP) server. In Example 1b, the prompt has an undesired instruction appended to it as part of the content of the CV upload.

This is a basic example, but in more complex applications such as large agentic AI workflows, this problem scales exponentially. Current research shows prompt injection remains the biggest vulnerability in OWASP's Top 10 for LLM Applications; no platforms currently provide cryptographic authentication or metadata-based trust verification for prompts.

How prompt fencing can tackle prompt injection

One way of tackling this common attack is using what we call prompt fencing. Prompt fencing leverages data architecture principles and cryptographic authentication to establish explicit security boundaries inside LLM prompts.

It works like this:

Metadata decoration. Each piece of information within a prompt is "fenced" by decorating it with cryptographically signed metadata. This metadata consists of at least a digital signature and a data quality grade (e.g., trusted, untrusted), data type (instructions, content, etc.). This is "fencing" the data: the start and end of each data block is stamped with a digitally signed hash that includes this metadata. Trusted application. The fencing demarcation is applied during the internal data assembly pipeline by trusted code. This ensures fence boundaries cannot be forged or tampered with by external actors. Metadata preservation. When data components are combined, their associated metadata is preserved throughout the assembly of the prompt. Policy enforcement. Additional security policies can be embedded in the metadata inside the fence markers by the creators of a given application. They can specify how the LLM should treat different data based on metadata. For instance, data marked as type:content and rating:untrusted should never be executed as instructions.

Ideally, in the future LLM vendors will update their platforms to natively support fence verification and metadata-aware processing. This would include:

Verifying the digital signature of each data chunk before its processed.

Training models to respect fence boundaries in a similar way to function and tool calling.

Treating metadata directives as primary constraints over any embedded instructions.

Halting execution and reporting security events upon if signature verification fails.

Anyone familiar with data architecture will note that grading and tagging metadata to data isn’t new. Indeed, this approach is a cornerstone of data architecture and used in data governance for various data processing objectives including human-in-the-loop workflow where human decisions are made based on metadata.

What good looks like

Continuing with our example, with prompt fencing the prompt would explicitly demarcate trusted instructions from untrusted content using signed fences. This allows the LLM to process each segment according to its assigned trust level and type.



Note: The fence syntax presented here (<sec:fence…>) is a simplified conceptual representation for illustrative purposes. Actual technical formats will need to be specified, but it should be expected that implementations use industry-standard enveloping mechanisms.