Sum Types

A "sum" type is the opposite of a "product" type. This Python object is an example of a product type:

man.studies_finance = True
man.has_trust_fund = False

The total number of combinations a man can have is 4, the product of 2 * 2:

studies_finance    has_trust_fund
True               True
True               False
False              True
False              False

If we add a third attribute, perhaps a has_blue_eyes boolean, the total number of possibilities multiplies again, to 8 (2 * 2 * 2):

studies_finance    has_trust_fund    has_blue_eyes
True               True              True
True               True              False
True               False             True
True               False             False
False              True              True
False              True              False
False              False             True
False              False             False
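
The multiplication can be checked by enumerating every assignment directly. This is a minimal sketch using itertools.product from the standard library; the helper name all_assignments is hypothetical and not part of the lesson.

```python
from itertools import product

# Each boolean attribute contributes two possible values.
BOOL_VALUES = (True, False)


def all_assignments(attribute_names):
    """Return every possible True/False assignment for the given attributes."""
    return [
        dict(zip(attribute_names, values))
        for values in product(BOOL_VALUES, repeat=len(attribute_names))
    ]


two = all_assignments(["studies_finance", "has_trust_fund"])
three = all_assignments(["studies_finance", "has_trust_fund", "has_blue_eyes"])

print(len(two))    # 4
print(len(three))  # 8
```

Every extra boolean doubles the length of the list, which is why product types grow so quickly.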

But let's pretend that we live in a world where there are really only three types of people that our program cares about:

  1. Dateable

  2. Undateable

  3. Maybe dateable

We can reduce the number of cases our code needs to handle by using an (admittedly fake) Pythonic sum type with only 3 possible variants:

class Person:
    def __init__(self, name):
        self.name = name

class Dateable(Person):
    pass

class MaybeDateable(Person):
    pass

class Undateable(Person):
    pass

Then we can use the isinstance built-in function to check if a Person is an instance of one of the subclasses. It's a clunky way to represent sum types, but hey, it's Python.

def respond_to_text(guy_at_bar):
    if isinstance(guy_at_bar, Dateable):
        return f"Hey {guy_at_bar.name}, I'd love to go out with you!"
    elif isinstance(guy_at_bar, MaybeDateable):
        return f"Hey {guy_at_bar.name}, I'm busy but let's hang out sometime later."
    elif isinstance(guy_at_bar, Undateable):
        return "Have you tried being rich?"
    else:
        raise ValueError("invalid person type")
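
To see the function in action, here is a self-contained sketch that repeats the classes above and calls respond_to_text once per variant. The names "Sam", "Alex", "Chris", and "Pat" are made up for illustration. The last call shows why this is only a workaround: nothing stops someone from creating a plain Person, so the else branch still has to exist.

```python
class Person:
    def __init__(self, name):
        self.name = name


class Dateable(Person):
    pass


class MaybeDateable(Person):
    pass


class Undateable(Person):
    pass


def respond_to_text(guy_at_bar):
    if isinstance(guy_at_bar, Dateable):
        return f"Hey {guy_at_bar.name}, I'd love to go out with you!"
    elif isinstance(guy_at_bar, MaybeDateable):
        return f"Hey {guy_at_bar.name}, I'm busy but let's hang out sometime later."
    elif isinstance(guy_at_bar, Undateable):
        return "Have you tried being rich?"
    else:
        raise ValueError("invalid person type")


# One call per variant of the pretend sum type.
for person in (Dateable("Sam"), MaybeDateable("Alex"), Undateable("Chris")):
    print(respond_to_text(person))

# A plain Person is not one of the three variants, so it hits the else branch.
try:
    respond_to_text(Person("Pat"))
except ValueError as exc:
    print(f"Rejected: {exc}")
```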

Sum Types

As opposed to product types, which can have many (often infinite) combinations, sum types have a fixed number of possible values. To be clear: Python doesn't really support sum types. We have to use a workaround and invent our own little system and enforce it ourselves.
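
For the simple case where the variants carry no data of their own, the standard library's enum.Enum does give a fixed set of values. The sketch below is a different technique from the subclass approach this lesson uses, and it is only a partial stand-in because an enum member cannot hold per-instance data such as a name. The class name Dateability is hypothetical.

```python
from enum import Enum


class Dateability(Enum):
    DATEABLE = "dateable"
    MAYBE_DATEABLE = "maybe dateable"
    UNDATEABLE = "undateable"


# The set of members is fixed when the class is defined.
print(len(Dateability))  # 3

# Looking up a value that is not a member fails instead of creating a new case.
try:
    Dateability("rich")
except ValueError as exc:
    print(f"Rejected: {exc}")
```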

Assignment

Whenever a document is parsed by Doc2Doc, it can either succeed or fail. In functional programming, we often represent errors as data (e.g. the ParseError class) rather than by raising exceptions, because exceptions are side effects. (This isn't standard Python practice, but it's useful to understand from an FP perspective.)

Complete the Parsed and ParseError subclasses.

  • Parsed represents success. It should accept a doc_name string and a text string and save them as properties of the same name.

  • ParseError represents failure. It should accept a doc_name string and an err string and save them as properties of the same name.

The test suite uses the isinstance function to see if an error occurred based on the class type.

Solution:

class MaybeParsed:
    pass


# don't touch above this line


class Parsed(MaybeParsed):
    def __init__(self, doc_name, text):
        self.doc_name = doc_name
        self.text = text


class ParseError(MaybeParsed):
    def __init__(self, doc_name, err):
        self.doc_name = doc_name
        self.err = err

Why Does This Work?

This works because it correctly follows object-oriented principles while staying true to a functional programming (FP) approach to error handling. Here’s why:

✅ 1. Inheritance Helps Categorize Outcomes

  • Parsed and ParseError inherit from MaybeParsed, meaning they share a common parent.

  • This allows us to easily check whether an object represents a parsing result using isinstance().

✅ 2. Each Class Stores Its Own Data Correctly

  • Parsed saves successful results:

    self.doc_name = doc_name
    self.text = text
    • It stores the document name (doc_name) and its extracted text (text).

  • ParseError saves failure information:

    self.doc_name = doc_name
    self.err = err
    • Instead of text, it stores err, which describes the reason for failure.

✅ 3. Functional Programming Approach

  • Instead of raising exceptions, we return an instance of either Parsed or ParseError to represent success or failure.

  • This makes it easy to handle results using pattern matching or conditional checks.

Example Usage: Handling Success and Failure

def process_document(doc_name, content):
    if content:  # If content is not empty, parsing succeeds
        return Parsed(doc_name, content)
    else:  # If content is empty, return an error
        return ParseError(doc_name, "Document is empty")


# Example 1: Successful parsing
result1 = process_document("report.txt", "This is the document content.")

# Example 2: Failed parsing
result2 = process_document("error_doc.txt", "")


# Handling the results
if isinstance(result1, Parsed):
    print(f"Success: {result1.doc_name} parsed with content: {result1.text}")
elif isinstance(result1, ParseError):
    print(f"Failed: {result1.doc_name} - {result1.err}")

if isinstance(result2, Parsed):
    print(f"Success: {result2.doc_name} parsed with content: {result2.text}")
elif isinstance(result2, ParseError):
    print(f"Failed: {result2.doc_name} - {result2.err}")

Output:

Success: report.txt parsed with content: This is the document content.
Failed: error_doc.txt - Document is empty

Key Takeaways

  1. We avoid exceptions. Instead of raise Exception("Parsing failed"), we use structured data to represent errors.

  2. We use isinstance to distinguish success and failure. This makes error handling explicit.

  3. We keep the data encapsulated in objects. Instead of returning just strings ("Success" or "Error: file not found"), we return structured objects with attributes (doc_name, text, err).

  4. This pattern is common in functional languages like Haskell and Scala. It mimics types such as Either in Haskell and Scala, and Result in Rust.
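
The first takeaway matters most at the boundary where an exception-raising call meets the rest of the program. Below is a minimal sketch, assuming the document arrives as raw bytes that should be UTF-8 text. The function name parse_bytes and the file names are hypothetical and not part of Doc2Doc. The exception is caught once and converted into a ParseError, so callers only ever see data.

```python
class MaybeParsed:
    pass


class Parsed(MaybeParsed):
    def __init__(self, doc_name, text):
        self.doc_name = doc_name
        self.text = text


class ParseError(MaybeParsed):
    def __init__(self, doc_name, err):
        self.doc_name = doc_name
        self.err = err


def parse_bytes(doc_name, raw):
    """Hypothetical parser: decode UTF-8 bytes, returning errors as data."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Convert the exception into a value instead of letting it propagate.
        return ParseError(doc_name, str(exc))
    return Parsed(doc_name, text)


ok = parse_bytes("report.txt", b"hello")
bad = parse_bytes("broken.txt", b"\xff\xfe")

print(type(ok).__name__)   # Parsed
print(type(bad).__name__)  # ParseError
```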
