Prompt engineering is one of the most heavily tested topics on the AWS Certified AI Practitioner (AIF-C01) exam. It’s not just about “writing good prompts” — AWS expects you to know specific techniques, when to use each, and how prompt engineering compares to other foundation model customization options like RAG and fine-tuning.
This guide covers everything AIF-C01 tests about prompt engineering, with concrete examples, exam-relevant patterns, and the decision frameworks that consistently pick the right answer.
Why Prompt Engineering Matters for AIF-C01
Prompt engineering lives in Domain 3: Applications of Foundation Models, which is 28 percent of the exam — the largest single domain. Within Domain 3, prompt engineering questions are among the most common. Expect at least 5 to 10 questions on prompt techniques, inference parameters, and the prompt engineering vs. RAG vs. fine-tuning decision.
What Is Prompt Engineering?
Prompt engineering is the practice of designing the input given to a foundation model to elicit the desired output, without changing the model’s parameters.
It’s the cheapest, fastest, and most accessible way to influence a foundation model’s behavior. No training, no infrastructure, no waiting. You just rewrite the input.
For AIF-C01, you need to understand:
- The main prompt engineering techniques
- Inference parameters that affect output behavior
- When prompt engineering is the right answer (vs. RAG or fine-tuning)
- How prompts can fail (and how to fix them)
Core Prompt Engineering Techniques
1. Zero-Shot Prompting
You give the model a task with no examples. The model relies entirely on its pre-training to figure out what you want.
Example:
Classify the following review as positive, negative, or neutral.
Review: "The shipping was fast and the product works as advertised."
When to use: Simple, well-known tasks where the model already knows the format.
When to avoid: Tasks where the model might guess the wrong format or struggle without context.
2. Few-Shot Prompting
You give the model a few examples of the task before asking for a new output.
Example:
Classify the sentiment.
Review: "Worst purchase ever." → Negative
Review: "It's okay, nothing special." → Neutral
Review: "Absolutely amazing, love it!" → Positive
Review: "The packaging was damaged but the product works." →
When to use: Tasks with a specific output format, domain-specific tasks, or when zero-shot results are inconsistent.
Exam pattern: “The model’s outputs are inconsistent. What technique should you use?” → Few-shot prompting.
3. Chain-of-Thought (CoT) Prompting
You instruct the model to reason step-by-step before producing the final answer. This dramatically improves accuracy on multi-step reasoning, math, and logic tasks.
Example:
Solve the problem. Think step by step before answering.
A store sells apples for $0.50 each and oranges for $0.75 each.
A customer buys 4 apples and 3 oranges. What is the total cost?
When to use: Multi-step reasoning, math, complex logical tasks, debugging-style problems.
Exam pattern: “The model gives wrong answers on multi-step problems.” → Chain-of-thought prompting.
4. Instruction Prompting and System Prompts
You use a clear instruction or system message to define the model’s role, behavior, and constraints.
Example:
System: You are a customer support assistant for a bookstore.
Answer only questions about orders, returns, and shipping.
If asked about anything else, politely decline.
User: What's the return policy for damaged books?
When to use: Production applications where you need consistent persona, constraints, or guardrails.
5. Prompt Chaining
You break a complex task into a sequence of prompts, where the output of one prompt feeds into the next.
Example flow:
- Prompt A: “Extract the main entities from this news article.”
- Prompt B: “Given these entities, summarize what each one did.”
- Prompt C: “Format the summary as a bulleted briefing.”
When to use: Complex pipelines, when a single prompt is too long or ambiguous, or when you want intermediate results for inspection.
6. Self-Consistency
You generate multiple responses to the same prompt (often using higher temperature) and pick the most common or best answer.
When to use: When accuracy on a single answer matters more than speed.
7. ReAct (Reason + Act)
A prompting pattern that combines reasoning steps with explicit action calls (e.g., calling an API or tool). This is the conceptual basis for Bedrock Agents.
Inference Parameters: The Other Half of Prompt Engineering
Prompt engineering also includes tuning inference parameters that change how the model samples its response. AIF-C01 tests these directly.
Temperature
Controls randomness in the model’s output. Range typically 0 to 1 (sometimes higher).
- Low temperature (0 to 0.3): Deterministic, focused, predictable. Best for factual or technical responses.
- Medium temperature (0.4 to 0.7): Balanced. Good for general use.
- High temperature (0.8 and above): Creative, diverse, sometimes erratic. Best for brainstorming, creative writing.
Exam pattern:
- “The output is too repetitive” → raise temperature
- “The output is too random” → lower temperature
Top-p (Nucleus Sampling)
The model samples from the smallest set of tokens whose cumulative probability is at least p.
- Low top-p (e.g., 0.1): Restrictive, only the most likely tokens. Output is conservative.
- High top-p (e.g., 0.95): Diverse, includes less likely tokens. Output is creative.
Top-k
The model only considers the k most likely tokens at each step.
- Low top-k (e.g., 5): Restrictive.
- High top-k (e.g., 100): Diverse.
Max Tokens
Caps the response length in tokens. Lower max tokens = shorter responses (and lower cost).
Stop Sequences
Strings that, when the model generates them, stop the response. Useful for:
- Stopping at the end of a structured response
- Preventing the model from continuing past a certain pattern
- Truncating at format markers
Inference Parameter Cheat Sheet
| Goal | Adjust |
|---|---|
| More creative output | Raise temperature, raise top-p |
| More deterministic output | Lower temperature |
| Restrict to most likely tokens | Lower top-p or top-k |
| Shorter responses | Lower max tokens |
| Stop at a specific pattern | Add stop sequences |
| Reproducible outputs | Lower temperature, set deterministic top-p |
Prompt Engineering vs. RAG vs. Fine-Tuning
This is the most important decision framework on the AIF-C01 exam, and it has a clear hierarchy.
Decision Rule of Thumb
| If you need… | Choose… |
|---|---|
| Quick behavior change with no training | Prompt engineering |
| Responses grounded in changing private data | Retrieval-Augmented Generation (RAG) |
| Stable style or behavior learned from labeled examples | Fine-tuning |
| The model to absorb large amounts of unlabeled domain text | Continued pre-training |
When Prompt Engineering Is the Answer
- The use case is solvable with examples or instructions in the prompt
- Costs and complexity must stay low
- The behavior change is small or stylistic
When RAG Beats Prompt Engineering
- The data is too large to fit in a prompt
- The data changes frequently and you don’t want to retrain
- You need citations or source references
When Fine-Tuning Beats Both
- You want consistent stylistic behavior across many interactions
- Prompt engineering keeps producing inconsistent outputs
- You have a labeled dataset of input/output pairs
When Continued Pre-Training Beats Fine-Tuning
- You have a large corpus of unlabeled domain-specific text (medical, legal, financial)
- You want the model to absorb domain vocabulary and writing patterns
- Fine-tuning examples alone aren’t enough
Comparison Table
| Approach | Cost | Time | Data Required | Best For |
|---|---|---|---|---|
| Prompt engineering | Lowest | Instant | Examples in prompt | Quick wins, lightweight tasks |
| RAG | Low | Days | Documents | Q&A on changing private data |
| Fine-tuning | Medium | Days to weeks | Labeled examples | Stable style, consistent format |
| Continued pre-training | High | Weeks | Large unlabeled corpus | Domain knowledge absorption |
Common Prompt Engineering Pitfalls
Pitfall 1: Vague Instructions
Bad: “Write something about machine learning.” Better: “Write a 2-paragraph introduction to supervised learning for a beginner audience, using a recipe analogy.”
Pitfall 2: Mixing Instructions and Examples Poorly
Bad: Examples appear after the instruction with no clear separator.
Better: Use clear delimiters (###, ---, or specific labels) to separate sections of your prompt.
Pitfall 3: Forgetting Context Window Limits
Long prompts can exceed the model’s context window. Symptoms include truncated outputs or errors. Solutions: chunking, summarization, or RAG.
Pitfall 4: Over-Reliance on Few-Shot Examples
Few-shot examples can leak into the output. Test for it. Sometimes a clear instruction is better than three confusing examples.
Pitfall 5: Ignoring Inference Parameters
Many candidates focus on prompt wording and forget that temperature alone can fix most “too repetitive” or “too random” complaints.
How AWS Tests Prompt Engineering on AIF-C01
The exam questions follow predictable patterns:
Pattern 1: Map Symptom to Technique
“The model gives different answers each time” → lower temperature. “The model skips reasoning steps and gets math wrong” → chain-of-thought. “The model doesn’t follow output formatting consistently” → few-shot prompting.
Pattern 2: Choose Customization Approach
“The model’s answers must reflect the latest internal documents updated weekly” → RAG. “The model must consistently match the company’s marketing tone” → fine-tuning. “The team needs a quick test to validate that a foundation model can handle this domain at all” → prompt engineering.
Pattern 3: Choose the Inference Parameter
“The output should be more creative” → raise temperature. “The output should be deterministic for testing” → lower temperature. “Limit response length to 200 tokens” → set max tokens. “Stop generation when the model writes ‘END’” → use stop sequences.
Pattern 4: Identify the Prompting Technique
Given a sample prompt, identify whether it’s zero-shot, few-shot, chain-of-thought, etc.
Building Prompt Engineering Intuition
The best way to internalize prompt engineering is to practice with real foundation models. A few exercises:
- Write a sentiment classifier three ways: zero-shot, few-shot, and few-shot with chain-of-thought. Compare quality.
- Take an inconsistent zero-shot prompt and add 3 examples. Measure consistency improvement.
- Run the same prompt at temperature 0, 0.5, and 0.9. Observe the difference.
- Construct a chain-of-thought prompt for a logic puzzle. Compare to a non-CoT baseline.
- Write a prompt that the model should refuse. Test whether it does, then add a system instruction to enforce refusal.
You don’t need to do all this on AWS — any foundation model platform works for building intuition. For exam scenarios specifically, our AWS Certified AI Practitioner mock exam bundle includes scenario questions that mirror the prompt engineering decisions AIF-C01 tests.
Prompt Engineering and Bedrock-Specific Features
Bedrock supports prompt engineering through:
- Direct model invocation — your prompt and inference parameters in a single API call
- Bedrock Knowledge Bases — automatically retrieves relevant document chunks and constructs an augmented prompt
- Bedrock Agents — uses prompt engineering plus reasoning to plan multi-step actions
- Prompt management features — versioning and reusing prompts as named resources
For AIF-C01, you should know that Bedrock provides the platform but the prompt engineering principles are universal across foundation models.
For deeper Bedrock coverage, see our Amazon Bedrock guide for AIF-C01.
Prompt Engineering and Responsible AI
Prompt engineering can introduce or mitigate responsible AI issues:
- Reduce hallucinations by adding context and grounding instructions (“Only answer based on the provided documents”)
- Enforce safe behavior with system prompts (“Refuse to provide medical or legal advice”)
- Reduce bias by carefully wording examples and instructions
- Increase transparency by asking the model to cite sources
Bedrock Guardrails complement prompt engineering by enforcing content controls at the API level. Don’t rely solely on prompt instructions for safety — combine prompts with Guardrails.
Quick Reference: Prompt Engineering on AIF-C01
| Concept | Key Point |
|---|---|
| Zero-shot | No examples, relies on pre-training |
| Few-shot | A few examples, fixes inconsistency |
| Chain-of-thought | Step-by-step reasoning, fixes multi-step errors |
| Instruction prompts | Define role and constraints |
| Prompt chaining | Multi-step pipelines |
| Temperature | Randomness control (high = creative) |
| Top-p / top-k | Diversity control |
| Max tokens | Response length cap |
| Stop sequences | Stop generation at a pattern |
| RAG | Right for changing facts |
| Fine-tuning | Right for stable style/behavior |
| Continued pre-training | Right for absorbing domain knowledge |
Memorize this table and the symptom-to-technique patterns and you’ll handle the prompt engineering portion of AIF-C01 with high confidence.
FAQ: Prompt Engineering for AIF-C01
Q: How many prompt engineering questions are on the exam? A: AWS doesn’t publish exact counts. Expect 5 to 10 questions across prompt techniques, inference parameters, and the customization decision (prompt engineering vs. RAG vs. fine-tuning).
Q: Do I need to write prompts on the exam? A: No. AIF-C01 is multiple choice. You’ll be asked to identify, choose, or apply prompt engineering concepts.
Q: Is RAG always better than prompt engineering? A: No. RAG is the right answer when data is large or frequently changing. For small, stable contexts, prompt engineering is faster and cheaper.
Q: When should I fine-tune instead of using RAG? A: When you want the model to learn a stable style or behavior from labeled examples — not when you need it to access changing facts.
Q: What’s the most common prompt engineering trap on AIF-C01? A: Confusing temperature direction (high = creative, low = deterministic) or choosing fine-tuning when RAG is correct.
Q: Do I need to know specific prompt templates? A: No. You need to know prompt engineering concepts and how to apply them to scenarios.
Q: Are inference parameters tested heavily? A: Yes — especially temperature. Expect at least 1 to 2 questions on inference parameter tuning.
Conclusion
Prompt engineering is one of the highest-leverage topics on AIF-C01 because it appears across many questions and follows clear decision rules. Master the core techniques (zero-shot, few-shot, chain-of-thought, instruction prompts), the inference parameters (especially temperature), and the customization decision framework (prompt engineering vs. RAG vs. fine-tuning vs. continued pre-training).
The candidates who consistently nail Domain 3 are the ones who internalize the symptom-to-technique mapping until it’s automatic. When the question stem says “inconsistent outputs,” your brain should immediately think “few-shot.” When it says “multi-step reasoning errors,” “chain-of-thought.” When it says “changing private data,” “RAG.” When it says “stable brand tone,” “fine-tuning.”
Want to drill scenario-based prompt engineering questions under exam conditions? Try our AWS Certified AI Practitioner mock exam bundle — 8 full-length exams with deep coverage of prompt techniques, inference parameters, and the foundation model customization decision, with detailed explanations on every question.