Match

Let's take another look at our example Enum from the previous lesson:

Color = Enum("Color", ["RED", "GREEN", "BLUE"])

Working With Enums

Python has a match statement (added in Python 3.10) that tends to be a lot cleaner than a chain of if/elif/else statements when we're working with a fixed set of possible values (like a sum type, or more specifically an enum):

def get_hex(color):
    match color:
        case Color.RED:
            return "#FF0000"
        case Color.GREEN:
            return "#00FF00"
        case Color.BLUE:
            return "#0000FF"

        # default case
        # (invalid Color)
        case _:
            return "#FFFFFF"
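
For comparison, here is a sketch of the same lookup written as an if/elif chain. The name get_hex_if is hypothetical, used only so the two versions can be told apart; it should behave the same way for the three members and for the fallback.

```python
from enum import Enum

Color = Enum("Color", ["RED", "GREEN", "BLUE"])


def get_hex_if(color):
    # Same lookup as get_hex, written without match.
    if color == Color.RED:
        return "#FF0000"
    elif color == Color.GREEN:
        return "#00FF00"
    elif color == Color.BLUE:
        return "#0000FF"
    else:
        # default case (invalid Color)
        return "#FFFFFF"


print(get_hex_if(Color.GREEN))  # #00FF00
print(get_hex_if("purple"))     # #FFFFFF
```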

If you have two values to match, you can use a tuple. The example assumes a second enum, Shade, with LIGHT and DARK members:

def get_hex(color, shade):
    match (color, shade):
        case (Color.RED, Shade.LIGHT):
            return "#FFAAAA"
        case (Color.RED, Shade.DARK):
            return "#AA0000"
        case (Color.GREEN, Shade.LIGHT):
            return "#AAFFAA"
        case (Color.GREEN, Shade.DARK):
            return "#00AA00"
        case (Color.BLUE, Shade.LIGHT):
            return "#AAAAFF"
        case (Color.BLUE, Shade.DARK):
            return "#0000AA"

        # default case
        # (invalid combination)
        case _:
            return "#FFFFFF"

The value we want to compare goes after the match keyword and is checked against each case pattern in order, from top to bottom. The block under the first pattern that matches is executed, and the remaining cases are skipped.

Assignment

Complete the convert_format function. Using the enum DocFormat, it should support 3 types of conversions:

From MD to HTML:

Assume the content is a single h1 tag in markdown syntax - it's a single string representing a line. Replace the leading # with an <h1> and add a </h1> to the end.

# This is a heading -> <h1>This is a heading</h1>

From TXT to PDF:

Simply add a [PDF] tag to the beginning and end of the content. Notice the spaces between [PDF] tags and the content:

This is some text -> [PDF] This is some text [PDF]

From HTML to MD:

Replace any <h1> tags with # and remove any </h1> tags.

<h1>This is a heading</h1> -> # This is a heading

Any other conversion:

If the conversion is not one of the three above, raise an Exception with the string "invalid type".

Solution

from enum import Enum


class DocFormat(Enum):
    PDF = 1
    TXT = 2
    MD = 3
    HTML = 4


# don't touch above this line


def convert_format(content, from_format, to_format):
    match (from_format, to_format):
        case (DocFormat.MD, DocFormat.HTML):
            # strip the leading "# " marker, then wrap in h1 tags
            return f"<h1>{content.lstrip('# ')}</h1>"
        case (DocFormat.TXT, DocFormat.PDF):
            return f"[PDF] {content} [PDF]"
        case (DocFormat.HTML, DocFormat.MD):
            # replace() works on exact substrings; lstrip("<h1>") strips a
            # set of characters and can eat into the heading text
            return content.replace("<h1>", "# ").replace("</h1>", "")
        case _:
            raise Exception("invalid type")

Why this works

The solution works because it correctly applies pattern matching (match-case) to handle different document format conversions while ensuring proper transformations for MD -> HTML, TXT -> PDF, and HTML -> MD. Here’s a breakdown of why it’s correct:


1️⃣ DocFormat Enum Ensures Valid Inputs

class DocFormat(Enum):
    PDF = 1
    TXT = 2
    MD = 3
    HTML = 4
  • The Enum defines the four valid formats: PDF, TXT, MD, and HTML. Python does not stop a caller from passing something else, but any value that is not one of the matched members falls through to the default case.

  • This makes checking from_format and to_format more structured than comparing raw strings.
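
A short sketch of what the Enum does and does not guarantee. It only uses standard enum lookups; nothing here is specific to the assignment beyond the DocFormat class itself.

```python
from enum import Enum


class DocFormat(Enum):
    PDF = 1
    TXT = 2
    MD = 3
    HTML = 4


# Lookup by name or by value returns the same member object.
print(DocFormat["MD"] is DocFormat.MD)  # True
print(DocFormat(3) is DocFormat.MD)     # True

# A plain string is not a member, so it would not match a DocFormat case.
print("MD" == DocFormat.MD)  # False

# Looking up a value that no member has raises ValueError.
try:
    DocFormat(99)
    lookup_failed = False
except ValueError:
    lookup_failed = True

print(lookup_failed)  # True
```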


2️⃣ Pattern Matching Ensures the Right Conversion Logic

match (from_format, to_format):
  • This is a tuple pattern match, which checks the (from_format, to_format) pair.


3️⃣ Case 1: Markdown (MD) to HTML (HTML)

case (DocFormat.MD, DocFormat.HTML):
    return f"<h1>{content.lstrip('# ')}</h1>"
  • Transformation Logic:

    • Markdown # This is a heading should become <h1>This is a heading</h1>.

    • lstrip("# ") strips every leading # and space character (the argument is a set of characters, not a prefix), which removes the "# " marker from an h1 line.

    • Wraps the remaining content in <h1> and </h1>.

Example:

convert_format("# Welcome", DocFormat.MD, DocFormat.HTML)
# Output: "<h1>Welcome</h1>"
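
The character-set behavior of lstrip can be checked directly. The sketch below compares it with str.removeprefix (available from Python 3.9), which removes one exact prefix; the "## Subheading" input is an illustrative extra, not part of the assignment.

```python
heading = "# Welcome"

# lstrip treats "# " as a set of characters: '#' and ' '.
print(heading.lstrip("# "))          # Welcome
print("## Subheading".lstrip("# "))  # Subheading

# removeprefix removes the exact prefix once, or nothing at all.
print(heading.removeprefix("# "))          # Welcome
print("## Subheading".removeprefix("# "))  # ## Subheading
```

For the assignment's single h1 line both approaches give the same result; they differ only on inputs that start with more # or space characters than the marker itself.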

4️⃣ Case 2: Text (TXT) to PDF (PDF)

case (DocFormat.TXT, DocFormat.PDF):
    return f"[PDF] {content} [PDF]"
  • Transformation Logic:

    • The input text is wrapped in [PDF] tags, with a space between each tag and the content.

Example:

convert_format("Some text", DocFormat.TXT, DocFormat.PDF)
# Output: "[PDF] Some text [PDF]"

5️⃣ Case 3: HTML (HTML) to Markdown (MD)

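case (DocFormat.HTML, DocFormat.MD):
    return content.replace("<h1>", "# ").replace("</h1>", "")
  • Transformation Logic:

    • replace("<h1>", "# ") swaps the opening tag for the Markdown heading marker.

    • replace("</h1>", "") removes the closing tag.

    • replace() works on exact substrings. lstrip("<h1>") and rstrip("</h1>") strip sets of characters instead, so they would turn <h1>hello</h1> into # ello.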

Example:

convert_format("<h1>Title</h1>", DocFormat.HTML, DocFormat.MD)
# Output: "# Title"
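
The difference between stripping a set of characters and removing an exact tag shows up as soon as the heading text starts or ends with one of the tag's characters. A minimal check, using "hello" as an illustrative heading:

```python
html = "<h1>hello</h1>"

# lstrip/rstrip strip any of the characters '<', 'h', '1', '>', '/'
# from the ends, so the leading "h" of "hello" is lost as well.
print(html.lstrip("<h1>").rstrip("</h1>"))  # ello

# replace works on the exact substrings.
print(html.replace("<h1>", "# ").replace("</h1>", ""))  # # hello

# removeprefix/removesuffix (Python 3.9+) also remove exact substrings.
print("# " + html.removeprefix("<h1>").removesuffix("</h1>"))  # # hello
```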

6️⃣ Handling Invalid Conversions

case _:    
    raise Exception("invalid type")
  • If none of the valid cases match, an exception is raised.

  • This rejects unsupported conversions (e.g., MD -> PDF or TXT -> HTML) instead of returning the content unchanged.

Example:

convert_format("Random text", DocFormat.MD, DocFormat.PDF)
# Raises Exception: "invalid type"

Why This Works Well

1. Pattern Matching is Clean and Readable

  • The match-case structure avoids multiple if-elif checks and makes the conversion logic explicit.

2. Enum Ensures Strict Format Matching

  • Strings like "Markdown" or "pdf" never match a DocFormat member, so they fall through to the default case and raise instead of being silently accepted.

3. Proper String Transformations

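  • The Markdown marker and the HTML tags are removed before the new wrapper is added. Note that lstrip() and rstrip() strip sets of characters, not exact substrings, so they are only safe when the heading text cannot begin or end with one of those characters.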

  • Adds [PDF] tags with proper spacing.

4. Error Handling is Explicit

  • If an invalid format conversion is attempted, an exception is raised instead of returning incorrect output.
