Structured LLM Prompting

TL;DR: Scroll down to the My Prompting Strategy section to copy-pasta my current workflow.


Are you having a hard time getting what you expect out of an LLM? Are you giving it a bunch of files and it’s getting lost in the task? Are you giving it multiple tasks and it is only completing some or performing poorly on a subset?

The problem often isn’t the model; it’s the prompt.

Most people use LLMs like a vague search query. Engineers know that to get a reliable output, you must provide a reliable input. Structure is the bridge. It’s how you turn many inputs and a vague task into a precise set of steps to follow with defined inputs and outputs.

What is Structured Prompting?

A standard prompt is a single conversational sentence or paragraph, like “Can you review this code?”

A structured prompt is a specification. It’s a formal set of instructions that tells the LLM who to be, what to read, what to do, and how to format the output. Think of it as an API call, not a chat message.

This approach gives you reliability, consistency, and the power to tackle far more complex tasks.


The Core Components of a Structured Prompt

A good prompt has three parts: persona, context, and instructions.

1. Give the LLM a “Persona”

Never start with a naked prompt. The first line should tell the LLM who it is. A persona narrows the model’s focus, activates its domain-specific knowledge, and sets the tone.

  • Weak Prompt: “Review this code.”
  • Strong Persona: “You are an expert, skeptical senior Python developer. Your primary goal is to find subtle bugs, race conditions, and security flaws. Do not comment on code style.”

More Persona Examples:

  • For Learning: “You are a patient university professor teaching a first-year computer science student. Explain the concept of recursion using simple, concrete analogies. Assume I do not know what a ‘stack’ is.”
  • For Brainstorming: “You are a creative product manager. We are brainstorming new features for a to-do list app. Generate 10 unconventional ideas that prioritize user delight over pure productivity.”
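
If you call a model through a chat-style API instead of a chat window, the persona typically belongs in the system message. Here is a minimal sketch using the common OpenAI-style message format (just the message structure; no particular client library is assumed):

# The persona goes in the "system" message; the task goes in the "user" message.
persona = (
    "You are an expert, skeptical senior Python developer. Your primary goal "
    "is to find subtle bugs, race conditions, and security flaws. "
    "Do not comment on code style."
)
messages = [
    {"role": "system", "content": persona},
    {"role": "user", "content": "Review the code in the attached context."},
]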

2. Provide Clear Context

The model cannot read your mind or your local files. You must provide all input material. Many modern prompting systems use XML-style tags to help the LLM separate context from instructions. This is the cleanest way to include large blocks of text, data, or code.

<context>
  <file id="user_service.py">
    Uploaded Python code file to edit.
  </file>

  <file id="api_docs">
      <![CDATA[
      # Paste content directly here; CDATA lets you include raw text,
      # data, or code without escaping XML special characters.
      # ... relevant documentation ...
      ]]>
  </file>
</context>

By wrapping your inputs in tags, you make it trivial for the LLM to find and reference them.
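
If you share the same files across many chats, the wrapping step is easy to script. Here is a minimal Python sketch; the build_context helper is hypothetical (not from any library), and the file names are just examples:

from pathlib import Path

def build_context(paths: list[str]) -> str:
    """Wrap local files in <file> tags so one <context> block can be
    pasted into a chat. (Illustrative helper, not part of any library.)"""
    parts = ["<context>"]
    for p in paths:
        text = Path(p).read_text()
        # A literal "]]>" inside CDATA would end the section early,
        # so split it using the standard XML escape.
        text = text.replace("]]>", "]]]]><![CDATA[>")
        parts.append(f'  <file id="{Path(p).name}">')
        parts.append("    <![CDATA[")
        parts.append(text)
        parts.append("    ]]>")
        parts.append("  </file>")
    parts.append("</context>")
    return "\n".join(parts)

print(build_context(["user_service.py", "api_docs.md"]))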

3. Order Your Tasks

This is simple but crucial. Tell the LLM how to think. A common mistake is to ask for analysis before the LLM has even read the data. A numbered list of instructions eliminates this ambiguity.

Always instruct the model to read and understand the context first.

  • Weak Prompt: “Summarize this document and find the main argument and any logical flaws.”
  • Strong Instructions:
    1. First, read the document in the <context> tag thoroughly.
    2. Second, identify and state the author's single main argument.
    3. Third, list any logical fallacies or unsupported claims you find.
    4. Finally, write a one-paragraph summary of the text.

This forces a logical, step-by-step process, which produces a much better result.


Intermediate Techniques

Once you master the basics, you can combine these techniques for more power.

Technique 1: Prompt-Chaining

Don’t ask one giant prompt to do 10 things. It will fail or forget steps.

Instead, break the job into a chain of smaller, single-purpose prompts. The output of prompt 1 becomes the input for prompt 2. Think of it like a Unix pipe (|).

Example: Writing a Blog Post

  1. Prompt 1 (Brainstormer): “You are a marketing expert. Generate 10 catchy titles for a blog post about ‘structured LLM prompting’.”
  2. Prompt 2 (Outliner): “You are a technical writer. Take this title: ‘[Title from Prompt 1]’. Generate a 5-point blog post outline with a hook, key sections, and a conclusion.”
  3. Prompt 3 (Drafter): “You are an expert blogger. Write a full draft of the blog post. Follow this outline: ‘[Outline from Prompt 2]’. Use this context: ‘[Your notes]’.”
  4. Prompt 4 (Editor): “You are a sharp-eyed editor. Review this draft: ‘[Draft from Prompt 3]’. Fix all typos and rewrite any passive sentences to be active.”

This chain is more reliable and gives you control at each step.
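
If you drive the chain from code instead of pasting between chats, it is plain function composition. Here is a sketch of the blog-post chain above, where llm_complete is a placeholder for whatever model call or chat window you use:

def llm_complete(prompt: str) -> str:
    """Stand-in for whatever model you use; one call = one prompt."""
    raise NotImplementedError("plug in your LLM API or chat window here")

def write_blog_post(topic: str, notes: str) -> str:
    # Prompt 1 (Brainstormer): generate candidate titles.
    titles = llm_complete(
        f"You are a marketing expert. Generate 10 catchy titles "
        f"for a blog post about '{topic}'."
    )
    title = titles.splitlines()[0]  # or pick one yourself between steps
    # Prompt 2 (Outliner): turn the chosen title into an outline.
    outline = llm_complete(
        f"You are a technical writer. Take this title: '{title}'. Generate a "
        f"5-point blog post outline with a hook, key sections, and a conclusion."
    )
    # Prompt 3 (Drafter): write the draft from the outline and your notes.
    draft = llm_complete(
        f"You are an expert blogger. Write a full draft of the blog post. "
        f"Follow this outline: '{outline}'. Use this context: '{notes}'."
    )
    # Prompt 4 (Editor): clean up the draft.
    return llm_complete(
        f"You are a sharp-eyed editor. Review this draft: '{draft}'. Fix all "
        f"typos and rewrite any passive sentences to be active."
    )

Because each step is a separate call, you can inspect or override any intermediate result, just as you would between chat windows.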

Technique 2: Meta-Prompting (The Prompt Assistant)

You don’t have to write these complex prompts alone. Use the LLM to help you. We call this meta-prompting: using a prompt to create a better prompt.

Here is a template you can copy and paste to turn any LLM into your personal prompt engineer. See My Prompting Strategy at the end for a more complete example of what I do.

You are an expert LLM prompt engineer. Your goal is to help me create a detailed, structured, and effective prompt for a new task.

My task: [I want to _________]
My desired output: [I need the LLM to produce _________]
My constraints: [The output must/must not _________]

Please help me by:
1.  Refining my task into a clear, primary goal.
2.  Defining the optimal "persona" for the LLM.
3.  Breaking my main task into a numbered list of smaller, logical subtasks for the LLM to follow.
4.  Identifying any missing context or information the LLM will need.
5.  Writing the final, structured prompt for me to use.


Advanced Workflow: The 3-Step Process

For complex tasks like large-scale code refactoring or rewriting a long paper, you can build a powerful system by separating planning from doing.

This workflow uses different LLMs (or the same LLM in different modes) for each step. I use this method because I have free browser access to Gemini Pro (as a student) but have to pay for LLM usage through my Cursor IDE. This lets the more powerful, free model do the planning while the cheaper LLM agent in my editor only executes the plan.

Step 1: The Prompt Generator (Meta-Prompt)

This is the meta-prompting step we just covered. You use an LLM to refine your goal and create a main, complex prompt for your task.

Step 2: The Expert/Planner (The “Smart” LLM)

Here, you use a powerful, advanced model (like GPT-4). Its job is to do the main thinking and output a final structured plan so another LLM can carry out the task.

The output from this step is not the final answer. It is a set of instructions for the next step, often formatted in JSON or XML.

  • Task: “Refactor user_service.py to use the new AsyncDatabase client.”
  • Planner’s Output (JSON):
{ "plan": [ 
    { 
        "file": "user_service.py", 
        "action": "replace_import",
        "old_line": "from db.client import DatabaseClient", 
        "new_line": "from db.async_client import AsyncDatabase" 
    }, 
    { 
        "file": "user_service.py", 
        "action": "replace_function_def", 
        "function_name": "get_user", 
        "new_def": "async def get_user(user_id: int) -> User:" 
    }, 
    { 
        "file": "user_service.py", 
        "action": "replace_line", 
        "line_number": 42, 
        "old_line": "user = db.get(user_id)", 
        "new_line": "user = await db.get(user_id)"
    } 
] }

Step 3: The Editor/Agent (The “Dumb” LLM)

This step uses a cheaper, faster, or local model (perhaps one built into your code editor). This “agent” LLM is not a “thinker”; it is a “doer.” I will often paste the plan directly into Cursor and have the LLM agent make all the edits.

It receives the structured plan from Step 2 and mechanically executes it. It doesn’t need to understand the why of the refactor; it only needs to follow the JSON instructions.

This makes the process more reliable, testable, and cost-effective. It is also much easier to review each edit and decide whether or not the editor LLM should execute it. You can even instruct the planner LLM to provide a brief reason for each edit so you can review its logic.
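
To make the “doer” role concrete, here is a Python sketch of an executor for the plan format above. The action names are just the ones from the example plan; in practice the agent built into your editor plays this role:

import json

def apply_plan(plan_json: str) -> None:
    """Mechanically apply a planner's edit list. No judgment required:
    each action is just a string replacement in a file."""
    for step in json.loads(plan_json)["plan"]:
        with open(step["file"]) as f:
            lines = f.read().split("\n")
        action = step["action"]
        if action == "replace_line":
            idx = step["line_number"] - 1  # the plan uses 1-indexed lines
            assert lines[idx].strip() == step["old_line"].strip(), "stale plan"
            indent = lines[idx][: len(lines[idx]) - len(lines[idx].lstrip())]
            lines[idx] = indent + step["new_line"]
        elif action == "replace_import":
            lines = [step["new_line"] if l.strip() == step["old_line"] else l
                     for l in lines]
        elif action == "replace_function_def":
            prefix = "def " + step["function_name"] + "("
            lines = [step["new_def"] if l.strip().startswith(prefix) else l
                     for l in lines]
        with open(step["file"], "w") as f:
            f.write("\n".join(lines))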


My Prompting Strategy

Feel free to copy what I do currently. This space is advancing quickly, so I wouldn’t be surprised if I write another post in a few months with better methodology.

  1. Save the code below to an XML file (I call it llm_prompting_guidelines_structured.xml).
  2. Upload llm_prompting_guidelines_structured.xml to an LLM chat (e.g., Gemini).
  3. Copy and paste the meta prompt template below into the chat, fill in the <task_definition>, and submit it to generate a prompt.
  4. Copy the generated prompt, open a new chat, upload any required files, and submit the new prompt.

Meta Prompt Template

You are an expert LLM prompt engineer. Your sole purpose is to generate a new, structured XML prompt based on my request below.

You must strictly follow the components, techniques, and examples defined in the attached file, `llm_prompting_guidelines_structured.xml`. Your output **must** be a single, complete `<prompt>` XML block.

**CRITICAL:** You are only to print the final XML prompt in a code block. You are **NOT** to execute the task described by the user. You are a *prompt generator*, not a task *executor*.

Please generate the prompt based on the following user-defined task:

<task_definition>

  ## 1. Primary Goal
  [e.g., "Summarize a research paper", "Write a Python script", "Edit my essay"]

  ## 2. Target Persona
  [e.g., "An expert academic reviewer", "A senior FAANG software engineer", "A professional copyeditor"]

  ## 3. Input Files & Context
  [e.g., "paper.pdf", "code.py", "style_guide.md"]
  
  [e.g., "The user is asking about our main competitor, 'TechCorp'."]

  ## 4. Key Instructions & Steps
  [e.g.,
  1. Read <file id='style_guide.md'> to understand the rules.
  2. Read <file id='code.py'> to analyze the code.
  3. Write a review of the code based *only* on the style guide.
  ]

  ## 5. Output Format
  [e.g., "A JSON object with the schema { 'review': '...', 'errors': [...] }", "A markdown file", "Only the corrected Python code in a code block"]

  ## 6. Constraints & Rules
  [e.g., "Do NOT invent new facts.", "The summary must be 3 sentences long.", "Do not include a preamble or conversational fillers."]

  ## 7. Advanced Techniques (Optional)
  [e.g., "Add a <reasoning> block to force step-by-step thinking.", "Add a Critique-Refine loop for this high-stakes task."]

</task_definition>

llm_prompting_guidelines_structured.xml

<?xml version="1.0" encoding="UTF-8"?>
<prompt_generation_guidelines>

  <components>
    
    <component name="system_prompt">
      <purpose>Defines the LLM's persona, role, and high-level goal (e.g., "You are an expert code reviewer.").</purpose>
    </component>
    
    <component name="instructions">
      <purpose>A clear, explicit, and numbered list of tasks. Define constraints, requirements, and ordering.</purpose>
    </component>
    
    <component name="context">
      <purpose>
        Provides ALL input material. This tag is the single source for
        all reference data, user-provided text, and file references.
      </purpose>
      <sub_tags>
        <tag name="file">
          <purpose>Represents a file uploaded alongside the prompt (e.g., <file id="guidelines.md" />).</purpose>
        </tag>
        <tag name="content">
          <purpose>Contains raw text, data, or code. Always wrap the content in a CDATA section (e.g., <content><![CDATA[User text to be processed]]></content>). Note that CDATA sections cannot nest; if the content itself contains a CDATA end marker, split it as shown in the examples below.</purpose>
        </tag>
      </sub_tags>
    </component>
    
    <component name="reasoning">
      <purpose>
        (Optional) Instructs the LLM to perform explicit step-by-step 
        reasoning *before* generating the final output. The LLM should
        output its thoughts inside a <thinking>...</thinking> block
        followed by the final answer.
      </purpose>
    </component>

    <component name="output_format">
      <purpose>Specifies the exact desired output schema (e.g., a JSON schema, a YAML template, or an XML structure).</purpose>
    </component>

  </components>

  <techniques>
    <technique name="Constraints">
      <purpose>Use explicit constraints in the instructions block to guide LLM behavior. Be specific.</purpose>
      <examples>
        <example type="positive">"Your answer must be 3 sentences long."</example>
        <example type="positive">"Only use information from the provided <file id='source.txt'>."</example>
        <example type="negative">"Do NOT use any external knowledge."</example>
        <example type="negative">"Do NOT include a preamble or conversational fillers in your answer."</example>
      </examples>
    </technique>

    <technique name="Critique-Refine Loop">
      <purpose>For high-stakes tasks, instruct the LLM to draft, review, and then finalize its own work.</purpose>
      <example_instruction>
        <![CDATA[
<instructions>
  1.  **Draft:** First, write a draft of the answer.
  2.  **Critique:** Second, review the draft against all constraints (e.g., "Is it in the correct format? Did I miss any requirements from the <context>?").
  3.  **Final Answer:** Third, provide the final, corrected answer.
</instructions>
        ]]>
      </example_instruction>
    </technique>
  </techniques>

  <templates_and_examples>

    <example id="basic_context_and_tool_use">
      <description>Shows the new <context> tag and direct tool use.</description>
      <generated_prompt>
        <![CDATA[
<prompt>
  <system_prompt>
    You are a marketing analyst. Your job is to summarize a document and find recent news.
  </system_prompt>
  <context>
    <file id="company_report_q3.pdf" />
    <content>
      <![CDATA[
      The user is asking about our main competitor, "TechCorp".
      ]]]]><![CDATA[>
    </content>
  </context>
  <instructions>
    1.  Read the attached <file id="company_report_q3.pdf"> to understand our Q3 performance.
    2.  Read the user's query in the <content> block.
    3.  Use the Google Search tool to find the 3 most recent news headlines about "TechCorp".
    4.  Summarize our Q3 performance in one paragraph.
    5.  List the 3 headlines.
  </instructions>
</prompt>
        ]]>
      </generated_prompt>
    </example>

    <example id="advanced_reasoning_cot">
      <description>Instructs the LLM to "think first" before answering.</description>
      <generated_prompt>
        <![CDATA[
<prompt>
  <system_prompt>
    You are a logic puzzle solver.
  </system_prompt>
  <context>
    <content>
      <![CDATA[
      There are three boxes: A, B, and C. One is red, one blue, one green.
      - Box A is not blue.
      - Box B is to the left of the green box.
      - The red box is to the left of the blue box.
      What color is each box?
      ]]]]><![CDATA[>
    </content>
  </context>
  <instructions>
    1.  Solve the logic puzzle from the <context>.
    2.  Follow the <reasoning> and <output_format> tags strictly.
  </instructions>
  <reasoning>
    First, provide your step-by-step deduction in a <thinking>...</thinking> block.
    Show how you eliminate possibilities to find the solution.
  </reasoning>
  <output_format>
    ```json
    {
      "A": "color",
      "B": "color",
      "C": "color"
    }
    ```
  </output_format>
</prompt>
        ]]>
      </generated_prompt>
    </example>

    <example id="advanced_critique_refine">
      <description>Forces the LLM to review its own work before outputting.</description>
      <generated_prompt>
        <![CDATA[
<prompt>
  <system_prompt>
    You are an expert SQL query writer for a large e-commerce database.
  </system_prompt>
  <context>
    <content>
      <![CDATA[
      Tables:
      - users (user_id, name, join_date)
      - orders (order_id, user_id, order_date, amount)
      
      Task: Find the names of all users who joined in 2024 and have
      placed an order totaling more than $100.
      ]]]]><![CDATA[>
    </content>
  </context>
  <instructions>
    1.  **Draft:** Write a draft of the SQL query to solve the task in <context>.
  2.  **Critique:** Review the draft and check for common errors (e.g., incorrect join keys, wrong date handling (use YEAR()), incorrect aggregation).
    3.  **Final Answer:** Provide only the final, production-ready SQL query in a markdown code block.
  </instructions>
</prompt>
        ]]>
      </generated_prompt>
    </example>

  </templates_and_examples>
</prompt_generation_guidelines>

