Prompt Engineering: How to Get Reliable, Structured JSON from an LLM
In our last post, we built a support bot that gives helpful, specific answers. Now, we'll tackle a different problem: data extraction. For production systems requiring guaranteed schema validation, see our structured output guide.
This post is for you if you've ever been frustrated by an LLM's inability to just give you the data. Your company has thousands of customer reviews, and you want to put them in a dashboard. But when you ask an LLM for the data, it gives you a chatty paragraph.
Today, we'll build a Review Analyzer and use two simple but powerful techniques—Few-Shot Prompting and Output Priming—to force the LLM to give us clean, structured, and machine-readable JSON.
For production systems, we recommend using tool calling with Pydantic for guaranteed schema validation.
The problem: the Chatty Extractor
Let's start with our "Problem": we ask for data, and the LLM gives us a conversation.
Use Case: We need to parse a customer review.
Review Text: "This blender is amazing! It's so quiet and powerful. I made a smoothie in 30 seconds. My only complaint is that the lid is really hard to clean."
graph TD
A["Review: 'This blender is amazing! ...'"] --> B(LLM)
B --> C["Bot: 'Sure! The user seems to really like this blender, especially how quiet and powerful it is. They only had one complaint: the lid is hard to clean. So, the sentiment is positive.'"]
style C fill:#ffebee,stroke:#b71c1c,color:#212121
Why this is bad:
- It's unstructured.
- It's unreliable. The format will change every time.
- It's unparseable. Our code can't put this into a database (`json.loads(text)` would crash, as the sketch below shows).
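To see the crash for yourself, here's a minimal sketch. The chatty reply is hard-coded purely for illustration; in practice it would come back from your LLM client.

```python
import json

# A typical "chatty" LLM reply, hard-coded here for illustration.
chatty_response = (
    "Sure! The user seems to really like this blender, especially how quiet "
    "and powerful it is. So, the sentiment is positive."
)

try:
    json.loads(chatty_response)
except json.JSONDecodeError as err:
    # json.loads expects JSON, not prose: "Expecting value: line 1 column 1 (char 0)"
    print(f"Parsing failed: {err}")
```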
Improvement 1: Show the format (Few-Shot Prompting)
The best way to get a specific format is to show the LLM what you want. This is called Few-Shot Prompting. We give the LLM 1-3 examples (or "shots") to learn the pattern.
The "How": We'll build a prompt that includes examples of the Review and the exact Output we expect.
# Our new prompt, with examples
prompt = """
You are a data extraction tool. Extract sentiment, pros, and cons
from the review, and format as a JSON object.
Review: "I loved this! It works perfectly."
Output: {"sentiment": "Positive", "pros": ["works perfectly"], "cons": []}
Review: "This was a total waste of money. It broke on day one."
Output: {"sentiment": "Negative", "pros": [], "cons": ["broke on day one"]}
Review: "This blender is amazing! It's so quiet and powerful. I made
a smoothie in 30 seconds. My only complaint is that the lid is
really hard to clean."
Output:
"""
graph TD
subgraph EXAMPLES["The 'Examples' (Few-Shot Prompt)"]
direction LR
A["Review 1 -> JSON 1"]
B["Review 2 -> JSON 2"]
end
C["Our Review: 'This blender is amazing...'"] --> D(LLM)
EXAMPLES --> D
D --> E["Output: Valid JSON with sentiment, pros, and cons"]
style E fill:#e8f5e9,stroke:#388e3c,color:#212121
Observation: This is excellent! The LLM perfectly mimics the pattern we gave it. The output is a clean, parsable JSON string. This is one of the most reliable ways to get a specific format.
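If you want to run this end to end, here's a minimal sketch, assuming the OpenAI Python SDK and the `prompt` string defined above (the model name is just a placeholder; any chat-completion client works the same way):

```python
import json

from openai import OpenAI  # pip install openai; assumes OPENAI_API_KEY is set

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whatever model you have access to
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # extraction tasks want deterministic output
)

raw = response.choices[0].message.content
review_data = json.loads(raw)  # still raises JSONDecodeError if the model strays
print(review_data["sentiment"], review_data["pros"], review_data["cons"])
```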
Think About It: Why is JSON a better format than, say, a comma-separated list? What happens if a "pro" in the review contains a comma (e.g., "fast, quiet, and cheap")?
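A quick sketch of the comma problem, if you want to poke at that first question:

```python
import json

# A comma-separated "pros" field is ambiguous once a pro itself contains commas.
pros_line = "fast, quiet, and cheap, easy to store"
print(pros_line.split(","))  # 4 fields -- but the reviewer meant 2 pros

# JSON keeps the delimiter and the data separate: strings are quoted, lists explicit.
print(json.loads('{"pros": ["fast, quiet, and cheap", "easy to store"]}')["pros"])
```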
Improvement 2: Nudge the format (Output Priming)
Few-shot prompting is great, but it makes our prompt very long (and thus more expensive, as we pay for input "tokens").
A "cheaper" trick for simpler models is Output Priming. Instead of giving full examples, we just start the LLM's answer for it. This "nudges" it to complete the structure we began.
The "How": We'll modify our prompt to end with the start of our desired JSON.
# Our new, lighter-weight prompt
prompt = """
You are a data extraction tool. Extract sentiment, pros, and cons
from the following review. Respond ONLY with a valid JSON object.
Review: "This blender is amazing! It's so quiet and powerful. I made
a smoothie in 30 seconds. My only complaint is that the lid is
really hard to clean."
JSON Output:
{"sentiment": "
"""
graph TD
subgraph NUDGE["The Nudge: Output Priming"]
A["Prompt with Review"]
B["Start of Output: sentiment field"]
end
NUDGE --> C(LLM)
C --> D["Output: Completes JSON structure with sentiment, pros, and cons"]
style D fill:#e8f5e9,stroke:#388e3c,color:#212121
Observation: By ending the prompt with `{"sentiment": "`, we strongly hint that the LLM should do nothing but complete the JSON. It's not 100% guaranteed (the model might still wrap its answer in Markdown code fences), but it's highly reliable and saves a lot of "token" cost.
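Here's the same minimal sketch adapted for priming, again assuming the OpenAI Python SDK and the lighter-weight `prompt` above. Because we ended the prompt mid-object, we may need to glue the prime back on before parsing:

```python
import json

from openai import OpenAI  # assumes OPENAI_API_KEY is set

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)

completion = response.choices[0].message.content.strip()

# The model may continue from our prime ('Positive", "pros": ...') or restart
# the whole object from scratch; handle both before parsing.
if not completion.startswith("{"):
    completion = '{"sentiment": "' + completion

review_data = json.loads(completion)
print(review_data)
```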
The professional solution: tool calling (Schema guarantee)
As we saw in our Bedtime Story project, the most robust method is to use Tool Calling (also called Function Calling or JSON Mode). This guarantees that the output is valid JSON matching the schema you define.
This diagram shows the "Good, Better, Best" for structured data:
graph TD
A[Best: Tool Calling] -- "Guarantees Schema & Syntax" --> B(Pydantic Model / JSON Mode)
C[Better: Few-Shot] -- "Teaches Schema & Syntax" --> D(Example-based)
E[Good: Output Priming] -- "Nudges Syntax" --> F(Hint-based)
style A fill:#e8f5e9,stroke:#388e3c,color:#212121
style C fill:#fff8e1,stroke:#f57f17,color:#212121
style E fill:#ffecb3,stroke:#f57f17,color:#212121
When your application must receive a specific schema to avoid crashing, Tool Calling is the engineer's choice. When you just need to guide a format in a pinch, Few-Shot and Priming are your go-to techniques.
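Here's a minimal sketch of that "Best" tier, assuming a recent OpenAI Python SDK with structured-output support and Pydantic; other providers expose equivalent tool-calling / JSON-schema options. The model name and field constraints are illustrative choices.

```python
from typing import List, Literal

from openai import OpenAI
from pydantic import BaseModel


class ReviewAnalysis(BaseModel):
    # Constraining sentiment to three values also addresses the "OK" problem below.
    sentiment: Literal["Positive", "Negative", "Neutral"]
    pros: List[str]
    cons: List[str]


client = OpenAI()

response = client.beta.chat.completions.parse(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "Extract sentiment, pros, and cons from the review."},
        {"role": "user", "content": "This blender is amazing! It's so quiet and powerful. "
                                    "My only complaint is that the lid is really hard to clean."},
    ],
    response_format=ReviewAnalysis,  # the SDK enforces this schema for us
)

analysis = response.choices[0].message.parsed  # a validated ReviewAnalysis instance
print(analysis.sentiment, analysis.pros, analysis.cons)
```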
Challenge for you
- Break the parser: Take the "Few-Shot" prompt and add a new review:
  Review: "It's 'OK', I guess. The design is nice, but I'm 'so-so' on the price."
- Analyze: Does the LLM output `{"sentiment": "Neutral"}` or `{"sentiment": "OK"}`? How would you update your examples to ensure the bot only outputs "Positive", "Negative", or "Neutral"?
- Go pro: Refactor the "Few-Shot" example to use Tool Calling with a Pydantic schema instead. See our structured output guide for details.
Key takeaways
- Few-shot prompting teaches patterns: Showing examples is the most reliable way to get consistent output formats
- Output priming saves tokens: Starting the output structure nudges the LLM while reducing prompt length
- Tool calling guarantees schema: For production systems, use tool calling with Pydantic for guaranteed valid JSON
- Choose the right technique: Use few-shot for flexibility, output priming for efficiency, and tool calling for reliability
For more on building production AI systems, check out our AI Bootcamp for Software Engineers.