P2 — Python Intermediate: The Patterns You Will Use Every Day

Learning objective

Write and confidently read list comprehensions, use lambda functions in one-line contexts, apply @property and basic inheritance in class design, validate function inputs using isinstance() and descriptive ValueError raises, use re.match() and re.search() to detect patterns in text, and recognise what a decorator does when you encounter @app.get or @task — without needing to write a custom decorator from scratch.

Why this matters

Week P1 gave you the four building blocks — strings, lists, dictionaries, functions. Week P2 gives you the six patterns that appear constantly in the Modules 1–8 code but would stop you cold if you encountered them for the first time at Module 3 or Module 8. None of these are advanced topics. They are the difference between reading Module 3's CI/CD pipeline and understanding it versus reading it and feeling uncertain. Every section maps directly to a specific point in the curriculum where you will use it.

Curriculum relevance map — where each topic first appears:

Topic | First real use | Module/Week
List comprehensions | Processing tool results, filtering chunks | Module 2, Week 10
Lambda functions | Sorting and filtering in pipeline code | Module 3, Week 16
OOP: @property | Pydantic model field access | Module 0, Week P4 (this track) and Module 8, Week 43 (advanced validators)
OOP: Inheritance | Extending ClaudeClient | Module 4, Week 17
Input validation (isinstance, ValueError) | Tool schema validation | Module 2, Week 11
Decorators (reading, not writing) | @app.get, @task, @flow | Module 3, Module 8
Regular expressions (reading + re.match) | Semantic chunker boundary detection | Module 8, Week 41

Pre-work — complete before the core concept

  • Corey Schafer: Python Tutorials — List Comprehensions (YouTube, ~10 min) — search "Corey Schafer list comprehension"
  • Corey Schafer: Python OOP Tutorial Part 2 — Class Variables and Part 4 — Inheritance (~25 min total)
  • No new installs required — all topics use the Python standard library

Core concept explained

1. List comprehensions — the one-liner that replaces four lines

In Modules 2–8, you will process lists of things constantly: tool results, API response chunks, document records, pipeline artefacts. Python gives you a concise syntax for transforming and filtering lists in a single readable line.

# Standard for-loop version — works, but verbose
tool_results = []
for item in api_response["content"]:
    if item["type"] == "tool_use":
        tool_results.append(item)

# List comprehension — same result, one line
tool_results = [item for item in api_response["content"] if item["type"] == "tool_use"]

# Dictionary comprehension — build a dict from a list
# Used in Module 4 when mapping tool names to their handlers
tool_handlers = {tool["name"]: tool["handler"] for tool in registered_tools}

# Nested comprehension — flatten a list of lists
# Used in Module 8 when aggregating chunks across multiple documents
all_chunks = [chunk for doc in documents for chunk in doc.chunks]

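How to read a comprehension: the basic pattern is [expression for item in iterable if condition]. Read it left to right as: "give me expression, for each item in iterable, but only if condition is true." The if condition part is optional. Dictionary comprehensions use {key: value for ...} in the same way, and a nested comprehension adds further for clauses that read in the same order as nested for-loops.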
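To make the parts of the pattern concrete, the sketch below runs one small list through a transform-only comprehension, a filter-only one, and one that does both. The blocks data is invented for illustration.

```python
# Hypothetical sample data, shaped like simplified content blocks
blocks = [
    {"type": "text", "text": "Hello"},
    {"type": "tool_use", "name": "lookup"},
    {"type": "text", "text": "Goodbye"},
]

# Expression only: transform every item (no condition)
types = [b["type"] for b in blocks]

# Condition only: keep matching items unchanged
text_blocks = [b for b in blocks if b["type"] == "text"]

# Expression and condition: filter, then transform what is kept
texts = [b["text"].upper() for b in blocks if b["type"] == "text"]

print(types)             # ['text', 'tool_use', 'text']
print(len(text_blocks))  # 2
print(texts)             # ['HELLO', 'GOODBYE']
```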

When to use a comprehension vs a for-loop:

  • Use a comprehension when you are building a new list from an existing one in one step
  • Use a for-loop when the body has side effects (writing to a file, calling an API, updating external state), when the logic spans more than two lines, or when the result requires accumulating non-list state (running totals, error counts)

Common mistake: writing comprehensions that are too long to read on one line. If the comprehension requires horizontal scrolling in your editor, write a for-loop instead. Readability is more important than conciseness.

# This is too long — use a for-loop
results = [{"id": item["id"], "text": item["content"][0]["text"], "tokens": item["usage"]["input_tokens"]} for item in response_list if item["stop_reason"] == "end_turn" and item["usage"]["input_tokens"] < 1000]

# This is fine
valid_responses = [r for r in response_list if r["stop_reason"] == "end_turn"]
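One possible for-loop version of the too-long example is sketched here. The response_list sample is invented, and its field layout is assumed from the comprehension above.

```python
# Hypothetical sample data in the shape the long comprehension expects
response_list = [
    {"id": "r1", "content": [{"text": "Hi"}], "stop_reason": "end_turn",
     "usage": {"input_tokens": 450}},
    {"id": "r2", "content": [{"text": "Cut off"}], "stop_reason": "max_tokens",
     "usage": {"input_tokens": 300}},
    {"id": "r3", "content": [{"text": "Long"}], "stop_reason": "end_turn",
     "usage": {"input_tokens": 1200}},
]

results = []
for item in response_list:
    tokens = item["usage"]["input_tokens"]
    # Skip responses that were cut off or that are too large
    if item["stop_reason"] != "end_turn" or tokens >= 1000:
        continue
    results.append({
        "id": item["id"],
        "text": item["content"][0]["text"],
        "tokens": tokens,
    })

print(results)  # [{'id': 'r1', 'text': 'Hi', 'tokens': 450}]
```

Each condition and each output field now sits on its own line, so a change to one of them shows up as a one-line diff.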

2. Lambda functions — anonymous functions in one line

Lambda functions are small, unnamed functions defined inline. You will see them frequently when reading other engineers' Python code and in sorting or filtering operations. You rarely need to write them yourself in this curriculum — but not recognising them will make unfamiliar code opaque.

# A regular function
def get_token_count(response: dict) -> int:
    return response["usage"]["input_tokens"]

# The equivalent lambda (shown for comparison only; use def when the function needs a name)
get_token_count = lambda response: response["usage"]["input_tokens"]

# Where you will actually use lambdas — as sorting keys
# Sort a list of API responses by token count (ascending)
responses.sort(key=lambda r: r["usage"]["input_tokens"])

# Sort tool results by name alphabetically
tools.sort(key=lambda t: t["name"])

# Build a new sorted list with sorted() and key (the original list is left unchanged)
cheapest_first = sorted(model_costs, key=lambda m: m["cost_per_token"])

The rule: if you can write the function in one line with a single expression, a lambda is acceptable as a sorting key or in a sorted() call. If the logic requires multiple steps, write a named function — it is more readable, testable, and debuggable. Do not use lambdas to store reusable logic; that is what def is for.
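The rule can be seen side by side in the sketch below: a single-expression key as a lambda, and a two-step key as a named function. The model_costs records and the availability_then_cost helper are hypothetical.

```python
# Hypothetical data: per-model pricing records
model_costs = [
    {"name": "model-a", "cost_per_token": 0.003, "available": True},
    {"name": "model-b", "cost_per_token": 0.001, "available": False},
    {"name": "model-c", "cost_per_token": 0.002, "available": True},
]

# Single expression: a lambda is fine
cheapest_first = sorted(model_costs, key=lambda m: m["cost_per_token"])

# Multi-step logic: a named function is easier to read and to test
def availability_then_cost(model: dict) -> tuple:
    """Sort available models first, then by ascending cost."""
    unavailable = not model["available"]   # False sorts before True
    return (unavailable, model["cost_per_token"])

preferred = sorted(model_costs, key=availability_then_cost)

print([m["name"] for m in cheapest_first])  # ['model-b', 'model-c', 'model-a']
print([m["name"] for m in preferred])       # ['model-c', 'model-a', 'model-b']
```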


3. Intermediate OOP — @property and inheritance

Week P1 introduced classes and __init__. Week P2 adds two patterns that appear throughout Modules 3–8.

@property — computed attributes

@property turns a method into an attribute: you access it without parentheses. You will see this in Pydantic models (Module 0 Week P4, and again in Module 8), in the ClaudeClient extensions (Module 4), and in the data contracts module (Module 8). Understanding it explains a confusing error: if you add parentheses, as in session.message_count(), Python first evaluates the property and then tries to call the value it returned, which fails with TypeError: 'int' object is not callable.

from datetime import datetime, timezone

class ConversationSession:
    """Represents an active conversation with a Claude agent."""

    def __init__(self, session_id: str, model: str):
        self.session_id = session_id
        self.model = model
        self._messages: list = []
        self._created_at = datetime.now(timezone.utc)

    @property
    def message_count(self) -> int:
        """Number of messages in this session. Read-only — computed from _messages."""
        return len(self._messages)

    @property
    def is_empty(self) -> bool:
        """True if no messages have been added yet."""
        return len(self._messages) == 0

    @property
    def age_seconds(self) -> float:
        """How long this session has been open."""
        return (datetime.now(timezone.utc) - self._created_at).total_seconds()

    def add_message(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

# Usage — note: no parentheses on @property access
session = ConversationSession("sess-001", "claude-haiku-4-5-20251001")
session.add_message("user", "Hello")
print(session.message_count)   # 1  — accessed like an attribute, not session.message_count()
print(session.is_empty)        # False
print(session.age_seconds)     # some float
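The two mistakes people most often make with properties are calling them and assigning to them. The minimal sketch below shows what Python does in each case; Counter is a hypothetical class used only for this demonstration.

```python
class Counter:
    """Minimal hypothetical class, used only to show property behaviour."""

    def __init__(self):
        self._items: list = []

    @property
    def count(self) -> int:
        return len(self._items)


c = Counter()
print(c.count)          # 0

# Adding parentheses calls the returned int, not the method
call_error = ""
try:
    c.count()
except TypeError as e:
    call_error = str(e)
    print(call_error)   # 'int' object is not callable

# A property with no setter cannot be assigned to
assign_blocked = False
try:
    c.count = 5
except AttributeError:
    assign_blocked = True
    print("count is read-only")
```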

Inheritance — extending a class

Inheritance lets you create a specialised version of an existing class. In Module 4, you extend the base ClaudeClient from Module 0 into a TriageAgent that adds healthcare-specific behaviour without rewriting the HTTP and logging logic. The example below applies the same technique to a HealthcareClaudeClient subclass. Recognising when to inherit versus when to compose is an important architectural decision.

import logging

from claude_client import ClaudeClient

class HealthcareClaudeClient(ClaudeClient):
    """
    Extends ClaudeClient with healthcare-specific behaviour.
    Inherits all HTTP, logging, and error handling from ClaudeClient.
    Adds: PHI audit logging, emergency escalation detection.
    """

    def __init__(self, log_dir: str = "logs", audit_log_dir: str = "audit"):
        super().__init__(log_dir=log_dir)     # calls ClaudeClient.__init__
        self.audit_log_dir = audit_log_dir

    def complete(self, system_prompt: str, user_message: str,
                  max_tokens: int = 1024, temperature: float = 1.0) -> dict:
        """Override complete() to add PHI audit logging before and after."""
        self._log_phi_access(user_message)              # healthcare-specific
        result = super().complete(system_prompt, user_message,
                                   max_tokens, temperature)   # call parent
        if result.get("success"):
            self._check_emergency_keywords(result["response_text"])
        return result

    def _log_phi_access(self, user_message: str) -> None:
        """Log that a message containing potential PHI was processed."""
        # logging is imported at module level so every method can use it
        logging.getLogger("audit").info(
            "PHI access logged",
            extra={"message_length": len(user_message)}
        )

    def _check_emergency_keywords(self, response_text: str) -> None:
        """Raise an alert if response contains emergency escalation keywords."""
        keywords = ["call 999", "call 911", "immediate emergency", "life-threatening"]
        if any(kw in response_text.lower() for kw in keywords):
            logging.getLogger("audit").critical(
                "Emergency escalation keyword detected in response"
            )

Three inheritance rules to remember:

  • Always call super().__init__() in the child class __init__ to ensure the parent's setup runs
  • Override a method only when you need to change its behaviour — not to repeat it
  • If you find yourself copying code from the parent class into the child, you are doing it wrong — that code should stay in the parent and be called via super()
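The three rules can be tried without the real ClaudeClient. The sketch below uses a hypothetical BaseClient as a stand-in parent and a hypothetical CountingClient child; neither class is part of the curriculum code.

```python
class BaseClient:
    """Hypothetical stand-in for a parent class such as ClaudeClient."""

    def __init__(self, log_dir: str = "logs"):
        self.log_dir = log_dir

    def complete(self, user_message: str) -> dict:
        return {"success": True, "response_text": f"echo: {user_message}"}


class CountingClient(BaseClient):
    """Adds a call counter without copying any parent logic."""

    def __init__(self, log_dir: str = "logs"):
        super().__init__(log_dir=log_dir)      # rule 1: parent setup runs first
        self.calls = 0

    def complete(self, user_message: str) -> dict:
        self.calls += 1                        # rule 2: only the new behaviour
        return super().complete(user_message)  # rule 3: reuse, do not copy


client = CountingClient(log_dir="tmp_logs")
result = client.complete("hi")
print(client.log_dir, client.calls, result["response_text"])
# tmp_logs 1 echo: hi
```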

4. Input validation — failing fast and loudly

The Module 0 stress test found that parse_claude_response(None) produced an unhelpful error. That is a real production problem: when a function receives bad input and fails 10 steps later with a cryptic error, the developer spends 20 minutes diagnosing the wrong thing. Failing fast — checking inputs at the function boundary and raising a descriptive error immediately — is one of the highest-value habits in production Python.

from typing import Optional
import re

def build_message(role: str, content: str) -> dict:
    """
    Build a single Claude API message dictionary.
    Validates inputs and fails with specific, actionable error messages.
    """
    # Type validation — check before doing anything else
    if not isinstance(role, str):
        raise TypeError(f"role must be a string, got {type(role).__name__}: {role!r}")
    if not isinstance(content, str):
        raise TypeError(f"content must be a string, got {type(content).__name__}: {content!r}")

    # Value validation — check the value is within accepted range
    valid_roles = {"user", "assistant"}
    if role not in valid_roles:
        raise ValueError(
            f"role must be 'user' or 'assistant', got '{role}'. "
            f"Note: system instructions belong in the system parameter, not as a message."
        )

    # Empty content validation: whitespace-only content counts as empty
    if not content.strip():
        raise ValueError(
            "content must contain at least one non-whitespace character, "
            f"got {content!r}."
        )

    return {"role": role, "content": content}


def validate_api_response(response: dict) -> str:
    """
    Extract text from a Claude API response with full validation.
    Raises descriptive errors at the exact point of failure.
    """
    if not isinstance(response, dict):
        raise TypeError(f"Expected dict, got {type(response).__name__}")

    if "content" not in response:
        raise KeyError(
            f"Response missing 'content' key. "
            f"Available keys: {list(response.keys())}. "
            f"This may indicate an API error response — check for an 'error' key."
        )

    content = response["content"]
    if not isinstance(content, list) or len(content) == 0:
        raise ValueError(
            f"Response 'content' must be a non-empty list, "
            f"got: {type(content).__name__} with value {content!r}"
        )

    first_block = content[0]
    if first_block.get("type") != "text":
        raise ValueError(
            f"Expected first content block type 'text', "
            f"got '{first_block.get('type')}'. "
            f"If the model used a tool, check stop_reason before extracting text."
        )

    return first_block["text"]

The four validation checks to apply to every function:

  1. Type check — is the input the right type? Use isinstance(), not type(x) == str
  2. Null check — can this be None? If not, say so explicitly
  3. Value check — is the value within the accepted range or set?
  4. Empty check — for strings and lists, is it non-empty when it must be?
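As a sketch of all four checks in one place, the hypothetical helper below validates a string and an integer before doing any work. The function name, parameters, and messages are illustrative and are not part of prompt_utils.py.

```python
def truncate_text(text: str, max_chars: int = 100) -> str:
    """Hypothetical helper: return text cut to at most max_chars characters."""
    # Checks 1 and 2: the type check also rejects None, since None is not a str
    if not isinstance(text, str):
        raise TypeError(f"text must be a string, got {type(text).__name__}: {text!r}")
    # bool is a subclass of int, so exclude it explicitly
    if not isinstance(max_chars, int) or isinstance(max_chars, bool):
        raise TypeError(f"max_chars must be an int, got {type(max_chars).__name__}")
    # Check 3: value must be in the accepted range
    if max_chars <= 0:
        raise ValueError(f"max_chars must be positive, got {max_chars}")
    # Check 4: whitespace-only text counts as empty
    if not text.strip():
        raise ValueError("text must contain at least one non-whitespace character")
    return text[:max_chars]


print(truncate_text("hello world", max_chars=5))  # hello

try:
    truncate_text(None)
except TypeError as e:
    none_error = str(e)
    print(none_error)  # text must be a string, got NoneType: None
```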

Why isinstance() and not type(x) == str: isinstance() handles inheritance correctly. isinstance(x, str) returns True whether x is a plain str or an instance of a subclass of str. type(x) == str returns False for a subclass instance, which is rarely what you want.
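The difference only shows up with a subclass, so this short sketch defines one. Prompt is a hypothetical str subclass created just for the comparison.

```python
class Prompt(str):
    """Hypothetical str subclass, used only for this comparison."""


p = Prompt("hello")

print(isinstance(p, str))        # True
print(type(p) == str)            # False
print(isinstance("hello", str))  # True
print(type("hello") == str)      # True
```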


5. Decorators — how to read them, not write them

A decorator is a function that wraps another function to add behaviour before or after it runs. You will encounter three groups of decorators in this curriculum: @app.get and @app.post (FastAPI, from Module 3), @task and @flow (Prefect, Module 8), and @pytest.mark.parametrize (testing, Module 3). You do not need to write custom decorators. You need to be able to read decorated code without confusion.

How to read a decorator:

# This:
@app.get("/health")
def health():
    return {"status": "ok"}

# Is exactly equivalent to this:
def health():
    return {"status": "ok"}
health = app.get("/health")(health)  # wrap health with app.get("/health")

The @ syntax is shorthand for "pass this function through the decorator and replace it with the result." The decorator adds behaviour — in FastAPI's case, it registers the function as an HTTP route handler. The function body itself is unchanged.
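To see "register the function and hand it back" without installing a web framework, here is a toy stand-in. The routes dictionary and the get() function are hypothetical and are not FastAPI's implementation; they exist only to make the equivalence visible.

```python
routes = {}  # hypothetical registry standing in for a framework's route table

def get(path: str):
    """Toy stand-in for a route decorator; not FastAPI's implementation."""
    def decorator(func):
        routes[path] = func   # the added behaviour: remember the handler
        return func           # hand the function back unchanged
    return decorator

@get("/health")
def health():
    return {"status": "ok"}

# Same effect, written without the @ syntax
def version():
    return {"version": "1.0"}
version = get("/version")(version)

print(sorted(routes))        # ['/health', '/version']
print(routes["/health"]())   # {'status': 'ok'}
print(health())              # {'status': 'ok'}
```

You are not expected to write code like get() in this curriculum; the point is only that the line under the @ is an ordinary function call.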

Reading Prefect decorators — Module 8 preview:

from prefect import flow, task

@task(retries=3, retry_delay_seconds=60)   # Prefect task: retries on failure
def ingest_documents(corpus_path: str) -> list[dict]:
    """Load documents. Automatically retried up to 3 times if this raises an exception."""
    ...

@flow(name="rag-pipeline")                 # Prefect flow — orchestrates tasks
def rag_pipeline(corpus_path: str) -> None:
    docs = ingest_documents(corpus_path)   # calling a task inside a flow
    ...

When you see @task(retries=3), read it as: "this function has been wrapped to automatically retry up to 3 times before propagating a failure." The function body is the same — the decorator adds the retry behaviour invisibly.
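For intuition about what "wrapped to retry" means, the sketch below hand-rolls a much simpler retry wrapper. It is an assumption-laden toy: retrying() is a hypothetical name, there is no delay between attempts, and it says nothing about how Prefect implements @task internally.

```python
import functools

def retrying(retries: int):
    """Toy retry decorator for illustration; not how Prefect implements @task."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempts = retries + 1            # the first try plus the retries
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise                 # out of retries: propagate
        return wrapper
    return decorator

calls = {"count": 0}

@retrying(retries=3)
def flaky_load() -> str:
    """Fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "loaded"

outcome = flaky_load()
print(outcome, calls["count"])   # loaded 3
```

The body of flaky_load() contains no retry logic at all, which is exactly the reading skill this section asks for: the behaviour comes from the line above the def.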

The one thing you need to know about writing decorators: you do not need to in this curriculum. If you encounter a situation where you think you need a custom decorator, use a regular function instead. Custom decorators are a readability liability when the team is small or the codebase is new.


6. Regular expressions — reading patterns, not memorising syntax

Regular expressions appear in exactly one place in this curriculum: Module 8 Week 41, where the semantic chunker uses re.match(r'^Section \d+[\.\(]', line) to detect document section boundaries. You do not need to memorise regex syntax. You need to be able to read a regex pattern and understand what it matches — and write simple patterns like "starts with a digit" or "contains only alphanumeric characters."

import re

# The two functions you will use
# re.match()  — checks if the pattern matches at the BEGINNING of the string
# re.search() — checks if the pattern exists ANYWHERE in the string

text = "Section 14(3)(b) defines the notification obligation."

# Does this line start with a section reference?
match = re.match(r'^Section \d+', text)
if match:
    print(f"Section reference found: {match.group()}")  # "Section 14"

# Does this line contain a phone number pattern anywhere?
# Pattern: 3 digits, a separator, 4 digits, a separator, 4 digits
phone = re.search(r'\d{3}[-.\s]\d{4}[-.\s]\d{4}', "Call us at 020-7946-0871 for support")
if phone:
    print(f"Phone found: {phone.group()}")  # "020-7946-0871"

# The pattern elements you need to recognise (not memorise; look them up when needed):
# \d    — any digit (0–9)
# \w    — any word character (letters, digits, underscore)
# \s    — any whitespace (space, tab, newline)
# .     — any single character except newline
# *     — zero or more of the preceding
# +     — one or more of the preceding
# ?     — zero or one of the preceding
# ^     — start of string (when outside [])
# $     — end of string
# [abc] — any one of: a, b, or c
# [^abc]— any character except: a, b, c
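The difference between re.match() and re.search() is easiest to see when the same pattern is applied to one string in which the text of interest is not at the start. The sample line is invented.

```python
import re

line = "See Section 14(3) for details."

# re.match() only looks at the start of the string
at_start = re.match(r'Section \d+', line)
print(at_start)        # None

# re.search() scans the whole string
found = re.search(r'Section \d+', line)
print(found.group())   # Section 14
print(found.start())   # 4

# A match object is truthy and None is falsy, so either result works in an if
has_ref = bool(found)
print(has_ref)         # True
```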

The Module 8 pattern decoded:

r'^Section \d+[\.\(]'
#  ^          — must start at beginning of the line
#   Section   — literal text "Section "
#          \d+ — one or more digits (the section number)
#             [\.\(] — followed by either a literal dot or an opening parenthesis
# Matches: "Section 1.", "Section 14(", "Section 3." — not "See Section 14"
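A quick way to confirm a decoded pattern is to run it against the examples it is supposed to accept and reject. The sketch below does that for the pattern quoted above; the last sample, a bare "Section 14" with nothing after the number, is an extra case added here.

```python
import re

SECTION_START = r'^Section \d+[\.\(]'   # pattern quoted from the text above

samples = ["Section 1.", "Section 14(", "Section 3.", "See Section 14", "Section 14"]
results = {s: re.match(SECTION_START, s) is not None for s in samples}

for sample, matched in results.items():
    print(f"{sample!r:18} -> {matched}")
# 'Section 1.'       -> True
# 'Section 14('      -> True
# 'Section 3.'       -> True
# 'See Section 14'   -> False
# 'Section 14'       -> False
```

The final case fails because the character class has no ? after it, so a dot or an opening parenthesis is required.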

Practical rule: when you need a regex, search for the pattern description first ("regex for starts with digit") rather than trying to construct the syntax from memory. Once you have a working pattern, test it at regex101.com before putting it in code. Document every regex with a comment explaining what it matches — future readers (including you in six months) will thank you.


Step-by-step exercise — Tier 2 (Scaffolded)

Deliverable: An extended prompt_utils.py adding three new functions that use the techniques from this week. These functions will be used directly in Module 2 and Module 4.


How this exercise works — read before starting

Module 0 uses three levels of exercise, and it is important to know which level you are at so you know what to attempt independently and when it is legitimate to look something up.

Tier 1 — Guided (Week P1). Code is shown in full in the Core Concept. You type it out line by line — no copy-paste. Typing rather than pasting is deliberate: the physical act of typing code forces you to read each character and builds pattern recognition faster than reading alone.

Tier 2 — Scaffolded (this week, Week P2). The function signature and docstring are given. The body is not. A hint points to the specific example in the Core Concept above that you should adapt. You write the body yourself — but you are not expected to invent the syntax from scratch. Find the relevant example, understand it, then write an adapted version in the new context. Expect each function to take 10–20 minutes.

Tier 3 — Independent (Weeks P4 and P5 and beyond). Only the specification is given. No hint, no example. This is the level expected from Module 1 onwards.

Week P2 is Tier 2. The three new functions in Step 2 are scaffolded — your hint tells you exactly which Core Concept example to adapt. Step 1 (input validation) is also Tier 2 with a worked example already in the Core Concept. Step 3 (test code) is Tier 1 — type it out, do not copy-paste, and read each assertion as you type it so you understand what it is verifying.

The 15-minute rule. If you are genuinely stuck on any function body after 15 minutes of real effort — not distracted effort, not "I'll look at the answer quickly" — go to Appendix A at the end of Week P2. It contains the reference implementation. Read it, understand each line, close it, and then type it from memory into your file. Looking at the reference implementation is not failure. Staying stuck for 45 minutes and abandoning the exercise is.


Step 1 — Implement build_message() with full input validation (Tier 2)

Open prompt_utils.py from Week P1. Replace the pass stub in build_message() with a full implementation. The Core Concept section "Input validation — failing fast and loudly" contains a complete worked example of build_message(); use it as your model and adapt it to fill in the stub. Your implementation must include:

  • isinstance() type checks on both role and content — raise TypeError if either is the wrong type. Use the exact error message format from the Core Concept: f"role must be a string, got {type(role).__name__}: {role!r}". The !r at the end gives a quoted representation — important for making None and empty strings visible in error messages.
  • A ValueError for invalid roles — the valid set is {"user", "assistant"}. Include a hint in the message explaining why "system" is not valid here.
  • A ValueError for whitespace-only content — use .strip() to detect it.
  • The return statement: {"role": role, "content": content}.

After implementing, run your Week P1 test file — all existing tests must still pass. Then add these two new tests directly below your existing tests in test_prompt_utils.py. Type them out, do not copy-paste:

# Test build_message with None role — should raise TypeError
try:
    build_message(None, "Hello")
    print("build_message None role: FAIL — should have raised TypeError")
except TypeError as e:
    assert "NoneType" in str(e), f"Expected 'NoneType' in error, got: {e}"
    print("build_message None role: PASS")

# Test build_message with empty content — should raise ValueError
try:
    build_message("user", "   ")
    print("build_message empty content: FAIL — should have raised ValueError")
except ValueError as e:
    print(f"build_message empty content: PASS — error: {e}")

Step 2 — Add three new functions (Tier 2 — scaffold each body from the Core Concept)

Type the following function scaffolds into prompt_utils.py — do not copy-paste the signatures or docstrings, type them. This forces you to read each parameter name and return type annotation as you go.

A note on extract_tool_calls. The mock data below (tool_use blocks) previews something you don't learn the meaning of until Module 2 Week 11 — this exercise is purely mechanical pattern-matching on dictionary keys, not an introduction to what a tool call is or why Claude emits one. Treat it as list-comprehension practice on realistic-looking data, not as tool-use instruction; the real explanation is still nine weeks away.

def extract_tool_calls(api_response: dict) -> list[dict]:
    """
    Extract all tool_use blocks from a Claude API response.
    Returns a list of tool call dicts, each with 'name', 'id', and 'input'.
    Returns an empty list if no tool calls are present or if 'content' is missing.

    Uses a list comprehension to filter content blocks by type.
    """
    # YOUR IMPLEMENTATION HERE
    #
    # Hint — find this line in the Core Concept "List comprehensions" section:
    #   tool_results = [item for item in api_response["content"] if item["type"] == "tool_use"]
    #
    # Your implementation adapts that pattern with two additions:
    #   1. Use .get("content", []) instead of ["content"] so a missing key returns []
    #      rather than raising a KeyError
    #   2. Wrap it in a return statement
    #
    # One-line body: return [item for item in api_response.get("content", [])
    #                         if item.get("type") == "tool_use"]
    pass


def sort_messages_by_role(messages: list[dict],
                           role_order: list[str] | None = None) -> list[dict]:
    """
    Return a copy of messages sorted by role according to role_order.
    Default order: ['user', 'assistant'] — all user messages first, then assistant.
    Does not modify the original list.

    Uses sorted() with a lambda key.
    """
    # YOUR IMPLEMENTATION HERE
    #
    # Hint — find this line in the Core Concept "Lambda functions" section:
    #   responses.sort(key=lambda r: r["usage"]["input_tokens"])
    #
    # Your implementation adapts that pattern with three changes:
    #   1. Use sorted() not .sort() — sorted() returns a new list, .sort() mutates in place
    #   2. The default role_order is ['user', 'assistant'] — set it if None is passed
    #   3. The key lambda returns the index of message["role"] in role_order:
    #      key=lambda m: role_order.index(m["role"])
    #
    # Two-line body:
    #   if role_order is None: role_order = ["user", "assistant"]
    #   return sorted(messages, key=lambda m: role_order.index(m["role"]))
    pass


def is_valid_section_ref(text: str) -> bool:
    """
    Return True if text starts with a valid section reference pattern.
    Valid patterns: "Section 1.", "Section 14(3)", "Section 2("
    Invalid patterns: "See Section 1", "SECTION 1", "section 1", ""

    Uses re.match() with the Module 8 Week 41 chunker pattern.
    """
    # YOUR IMPLEMENTATION HERE
    #
    # Hint — find this block in the Core Concept "Regular expressions" section:
    #   match = re.match(r'^Section \d+', text)
    #   if match: ...
    #
    # Your implementation adapts that pattern with three changes:
    #   1. Add import re at the top of prompt_utils.py (after the existing imports)
    #   2. Use the Module 8 pattern with a trailing ? added: r'^Section \d+[\.\(]?'
    #      The [\.\(]? means: optionally followed by a dot or opening parenthesis,
    #      so a bare "Section 14" also counts as valid here
    #   3. Return True if re.match() returns a match object (not None), False otherwise
    #
    # One-line body: return re.match(r'^Section \d+[\.\(]?', text) is not None
    pass

After typing each scaffold: close the Core Concept section and write the function body from memory using the hint. If you cannot recall the syntax after reading the hint, re-read the relevant Core Concept section — do not skip ahead to Appendix A until you have made a genuine attempt.

Step 3 — Type the test code (Tier 1 — do not copy-paste)

Type the following test code into test_prompt_utils.py. As you type each assert statement, say out loud what it is checking — for example: "assert len(tool_calls) == 2 — this confirms that two tool_use blocks were extracted, not three (which would include the text block) and not zero (which would mean the filter failed)." Reading tests aloud while typing them is the fastest way to understand what they verify.

Add to test_prompt_utils.py:

# Test extract_tool_calls
mock_response_with_tools = {
    "content": [
        {"type": "text", "text": "I'll look that up."},
        {"type": "tool_use", "name": "lookup_patient", "id": "t_001",
         "input": {"patient_id": "P001"}},
        {"type": "tool_use", "name": "check_availability", "id": "t_002",
         "input": {"provider_id": "DR001"}},
    ],
    "stop_reason": "tool_use"
}

mock_response_no_tools = {
    "content": [{"type": "text", "text": "Hello"}],
    "stop_reason": "end_turn"
}

tool_calls = extract_tool_calls(mock_response_with_tools)
assert len(tool_calls) == 2, f"Expected 2 tool calls, got {len(tool_calls)}"
assert tool_calls[0]["name"] == "lookup_patient"
assert tool_calls[1]["id"] == "t_002"
print("extract_tool_calls with tools: PASS")

no_tools = extract_tool_calls(mock_response_no_tools)
assert no_tools == [], f"Expected empty list, got {no_tools}"
print("extract_tool_calls no tools: PASS")

# Test sort_messages_by_role
messages = [
    {"role": "assistant", "content": "Hello"},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "How can I help?"},
    {"role": "user", "content": "I need help"},
]
sorted_msgs = sort_messages_by_role(messages)
roles = [m["role"] for m in sorted_msgs]
assert roles == ["user", "user", "assistant", "assistant"], f"Unexpected order: {roles}"
print("sort_messages_by_role default order: PASS")

# Verify original list is not modified
assert messages[0]["role"] == "assistant", "Original list was modified — should not be"
print("sort_messages_by_role does not modify original: PASS")

# Test is_valid_section_ref
valid_cases = ["Section 1.", "Section 14(3)", "Section 2(", "Section 100."]
invalid_cases = ["See Section 1", "SECTION 1", "section 1", "Article 1.", ""]
for case in valid_cases:
    assert is_valid_section_ref(case), f"Expected True for '{case}'"
print(f"is_valid_section_ref valid cases ({len(valid_cases)}): PASS")
for case in invalid_cases:
    assert not is_valid_section_ref(case), f"Expected False for '{case}'"
print(f"is_valid_section_ref invalid cases ({len(invalid_cases)}): PASS")

Step 4 — Run all tests and diagnose failures

python test_prompt_utils.py

All tests — Week P1 and Week P2 — must pass before proceeding.

When a test fails, read the assertion error message before doing anything else. The message tells you exactly what happened. For example:

AssertionError: Expected 2 tool calls, got 0

This means extract_tool_calls returned an empty list. The most likely cause: the list comprehension condition item["type"] == "tool_use" is not matching. Check whether you used .get("type") (returns None if missing, comparison fails silently) versus ["type"] (raises KeyError if missing). Both are valid — but the mock data has "type" in every block, so neither should produce 0 results unless the comprehension itself is wrong.
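The difference between the two access styles only appears when the key is absent. A minimal sketch with an invented, deliberately malformed block:

```python
block = {"text": "no type key here"}   # hypothetical malformed block

print(block.get("type"))                # None
print(block.get("type") == "tool_use")  # False, and no exception is raised

try:
    block["type"]
except KeyError as e:
    missing = e.args[0]
    print(f"KeyError for key: {missing}")   # KeyError for key: type
```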

AssertionError: Original list was modified — should not be

This means sort_messages_by_role used .sort() instead of sorted(). The difference: .sort() sorts in place and modifies the original list; sorted() returns a new list and leaves the original unchanged.
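The same difference on a three-element list, including the detail that .sort() returns None:

```python
original = [3, 1, 2]

new_list = sorted(original)        # returns a new list
snapshot = list(original)          # copy taken before the in-place sort
print(new_list, snapshot)          # [1, 2, 3] [3, 1, 2]

returned = original.sort()         # sorts in place and returns None
print(returned, original)          # None [1, 2, 3]
```

Writing messages = messages.sort() is a common slip for this reason: it replaces the list with None.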

AssertionError: Expected True for 'Section 1.'

This means is_valid_section_ref returned False for a valid case. Check whether re.match() returned None — print the match object inside the function temporarily to see what it produces: print(re.match(r'^Section \d+[\.\(]?', text)).

Do not use print() as your primary debugging strategy for complex failures — it clutters the output and disappears when you fix the bug. Use logging.debug() inside the function body and logging.basicConfig(level=logging.DEBUG) at the top of the test file. You can then suppress debug output later by changing the level to logging.INFO.
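A minimal sketch of that setup follows. The logger name and the count_tool_calls helper are hypothetical; force=True (available from Python 3.8) is included only so the configuration applies even if logging was already configured earlier in the session.

```python
import logging

# force=True replaces any handlers configured earlier (Python 3.8+)
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s", force=True)
logger = logging.getLogger("prompt_utils")

def count_tool_calls(api_response: dict) -> int:
    """Hypothetical helper used only to show where debug logging goes."""
    content = api_response.get("content", [])
    logger.debug("content has %d blocks", len(content))
    matches = [b for b in content if b.get("type") == "tool_use"]
    logger.debug("matched %d tool_use blocks", len(matches))
    return len(matches)

n = count_tool_calls({"content": [{"type": "text"}, {"type": "tool_use"}]})
print(n)   # 1

# Later: raise the level to silence the debug lines without deleting them
logging.getLogger().setLevel(logging.INFO)
```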

Step 5 — Git commit

git add prompt_utils.py test_prompt_utils.py
git commit -m "extend prompt_utils with input validation, comprehensions, and regex"

Step 6 — Reflection

Under a new ## Week P2 heading in journal.md, write 4–6 sentences answering these questions in prose (same format as Week P1). Be specific — vague answers like "it was hard" or "the test passed" do not qualify. Commit the journal.

  • Which of the three new functions took the most attempts to get right, and what specifically was wrong on the first attempt? If your first attempt passed all tests, describe the moment you understood why the hint said to use .get("content", []) instead of ["content"].
  • Copy-paste the actual assertion error message you saw when at least one test failed. Explain in plain English what that message told you about the bug — not what the bug was, but what the error message revealed about it.
  • The sort_messages_by_role function uses role_order.index(m["role"]) as the sort key. What would happen if a message had role = "system"? What error would Python raise, and at what point in the execution? How would you change the function to handle this gracefully?

Self-check questions

Q1. You have a list of Claude API responses: responses = [{"stop_reason": "end_turn", "usage": {"input_tokens": 450}}, {"stop_reason": "max_tokens", "usage": {"input_tokens": 1200}}, {"stop_reason": "end_turn", "usage": {"input_tokens": 300}}]. Which one-liner produces a list of token counts for only the end_turn responses?

Q2. A function has the signature def process_chunks(chunks: list, max_tokens: int = 512) -> list. A caller passes chunks=None. Which input validation approach gives the most useful error message?

Q3. You see this code in a Module 8 pipeline file: @task(retries=3, retry_delay_seconds=60) above a function definition. What does this mean for how the function behaves at runtime?

Progression gate

Before moving on from Week P2 you should be able to: write a list comprehension that filters and transforms a list of API response dictionaries in a single line; explain in one sentence why isinstance(x, str) is preferred over type(x) == str; describe what @app.get("/health") does to the function below it without using the word "magic"; and use re.match() to test whether a string starts with a given pattern.


Appendix A — Reference implementations

Use this only after a genuine 15-minute attempt on each function. Read the implementation, understand every line, close this appendix, and type the implementation from memory into your file. If you copy-paste from here directly into your code without understanding it, you will encounter the same function again in Module 2 Week 10 and Module 8 Week 41 — and not recognise it.

# ── Reference: build_message() with full input validation ───────────────────

import re  # used by is_valid_section_ref() below; add to top of prompt_utils.py with other imports

def build_message(role: str, content: str) -> dict:
    """
    Build a single message dictionary for the Claude API messages array.

    Args:
        role: Either 'user' or 'assistant'
        content: The text content of the message

    Returns:
        Dictionary with 'role' and 'content' keys

    Raises:
        TypeError:  If role or content is not a string
        ValueError: If role is not 'user' or 'assistant', or content is empty or whitespace-only
    """
    if not isinstance(role, str):
        raise TypeError(
            f"role must be a string, got {type(role).__name__}: {role!r}"
        )
    if not isinstance(content, str):
        raise TypeError(
            f"content must be a string, got {type(content).__name__}: {content!r}"
        )
    valid_roles = {"user", "assistant"}
    if role not in valid_roles:
        raise ValueError(
            f"role must be 'user' or 'assistant', got {role!r}. "
            "Note: system instructions belong in the system parameter, "
            "not as a message role."
        )
    if not content.strip():
        # Whitespace-only content is rejected too, because .strip() reduces it to "".
        raise ValueError(
            "content must contain at least one non-whitespace character, "
            f"got {content!r}."
        )
    return {"role": role, "content": content}


# ── Reference: extract_tool_calls() ─────────────────────────────────────────

def extract_tool_calls(api_response: dict) -> list[dict]:
    """
    Extract all tool_use blocks from a Claude API response.
    Returns a list of tool call dicts, each with 'name', 'id', and 'input'.
    Returns an empty list if no tool calls are present or if 'content' is missing.
    """
    return [
        item for item in api_response.get("content", [])
        if item.get("type") == "tool_use"
    ]
    # Why .get() not []:
    # api_response.get("content", []) — if "content" key is missing, returns []
    # item.get("type") == "tool_use"  — if "type" key is missing, returns None,
    #                                   comparison to "tool_use" is False, block skipped
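
The behaviour described in the comments above can be checked by running the same comprehension against a hand-written response. This is a minimal sketch: the sample_response dictionary, including the search_docs tool name and the toolu_01 id, is invented for illustration and is not real API output.

```python
# Hypothetical response: one text block, one tool_use block, and one
# malformed block with no "type" key (all values invented for illustration).
sample_response = {
    "content": [
        {"type": "text", "text": "Let me look that up."},
        {"type": "tool_use", "id": "toolu_01", "name": "search_docs",
         "input": {"query": "refund policy"}},
        {"text": "block without a type key"},
    ]
}

# Same comprehension as the reference implementation, written inline.
tool_calls = [
    item for item in sample_response.get("content", [])
    if item.get("type") == "tool_use"
]

print(len(tool_calls))        # 1
print(tool_calls[0]["name"])  # search_docs
```

The block without a "type" key is skipped silently rather than raising KeyError, which is the point of using item.get("type").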


# ── Reference: sort_messages_by_role() ──────────────────────────────────────

def sort_messages_by_role(messages: list[dict],
                           role_order: list[str] | None = None) -> list[dict]:
    """
    Return a copy of messages sorted by role according to role_order.
    Default order: ['user', 'assistant'].
    Does not modify the original list.
    """
    if role_order is None:
        role_order = ["user", "assistant"]
    return sorted(messages, key=lambda m: role_order.index(m["role"]))
    # Why sorted() not .sort():
    # sorted() returns a NEW list — original is unchanged
    # .sort() modifies the list in place — test "original not modified" would fail
    #
    # Why role_order.index(m["role"]):
    # .index() returns the position of the value in the list
    # "user" → 0, "assistant" → 1
    # sorted() uses these integers to order the messages
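
One property the comments above do not mention: Python's sorted() is stable, so messages that share a role keep their original relative order. A minimal sketch with invented messages, using the same key as the reference implementation:

```python
# Invented messages; the content values only label the original positions.
messages = [
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "u1"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "u2"},
]
role_order = ["user", "assistant"]

ordered = sorted(messages, key=lambda m: role_order.index(m["role"]))

print([m["content"] for m in ordered])   # ['u1', 'u2', 'a1', 'a2']
print([m["content"] for m in messages])  # ['a1', 'u1', 'a2', 'u2'] (original unchanged)
```

Within each role, u1 still precedes u2 and a1 still precedes a2, so conversation order is preserved inside each group.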


# ── Reference: is_valid_section_ref() ───────────────────────────────────────

def is_valid_section_ref(text: str) -> bool:
    """
    Return True if text starts with a valid section reference pattern.
    Valid: "Section 1.", "Section 14(3)", "Section 2("
    Invalid: "See Section 1", "SECTION 1", "section 1", ""
    """
    return re.match(r'^Section \d+[\.\(]?', text) is not None
    # Pattern breakdown:
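    # ^         — anchors the pattern at position 0. re.match() already anchors
    #             there, so ^ is redundant here, but it keeps the pattern correct
    #             if it is ever reused with re.search()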
    # Section   — literal text "Section " (capital S, space included)
    # \d+       — one or more digits (the section number)
    # [\.\(]?   — optionally followed by: a literal dot OR an opening parenthesis
    #             The ? makes this character optional
    # is not None — re.match returns a match object on success, None on failure
    #               Comparing to None converts it to a clean boolean
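
The docstring's valid and invalid examples can be checked directly. The sketch below applies the same pattern to each one; the SECTION_REF name and the precompiled form are choices made for this example and are not part of the reference implementation.

```python
import re

# Same pattern as is_valid_section_ref(), precompiled for reuse.
SECTION_REF = re.compile(r'^Section \d+[\.\(]?')

valid = ["Section 1.", "Section 14(3)", "Section 2("]
invalid = ["See Section 1", "SECTION 1", "section 1", ""]

# Prints each input alongside True (matches) or False (does not match).
for text in valid + invalid:
    print(repr(text), SECTION_REF.match(text) is not None)
```

Every entry in valid prints True and every entry in invalid prints False, which matches the docstring.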

Questions to answer before closing Appendix A:

  1. In extract_tool_calls, what is the difference between api_response.get("content", []) and api_response["content"]? In what real scenario would the difference matter?
  2. In sort_messages_by_role, what error would role_order.index("system") raise if "system" is not in role_order? How would you change the lambda to handle unknown roles gracefully (place them at the end)?
  3. In is_valid_section_ref, re.match() already anchors at the start of the string, so the ^ is redundant as written. Suppose the function used re.search() instead: what would the ^ then prevent, and which currently-invalid inputs would incorrectly pass without it?

Write your answers in journal.md before moving to Week P3.