Improving The Readability Of Pytest Output (With Colour!)

Published on January 1st 2019

Pytest is quickly becoming the de facto testing framework in the Python community, but I’ve always found the way it reports assertion errors to be rather difficult to parse quickly in many circumstances. In this post, I’ll explore how we can use a lesser-known part of the Python standard library and the pytest plugin system to output colourful diffs similar to those seen in testing frameworks for languages such as Elixir and JavaScript. This isn’t a tutorial, but it should give you an overview of the following topics:

  • Getting started with making a pytest plugin
  • How to use the Python standard library to calculate diffs between text
  • A brief description of how to use partial application in Python
  • Formatting terminal output with escape sequences (programming language agnostic)

Here’s what a failing test looks like with pytest-clarity installed:

Colourful Pytest Output

From this output, it’s easy to see at a glance why your assertion has failed. Your eyes are immediately drawn to the differences between the dictionaries, and the colours inform you what would need to be added or removed from either side of the assertion in order for them to match.

For comparison, I’ve attached the output for the same test using vanilla pytest below (with very verbose logging -vv enabled).

Vanilla Pytest Output

In my opinion there are many things wrong with this output, but the most glaring issue is that it distracts from the actual content of the objects under comparison. The representations of these objects are interspersed with +, -, ?, and ^ symbols, making it overly difficult to answer the question “what do the objects I’m comparing actually look like, and how do they differ?”

Modifying Pytest Output With A Plugin

Pytest has a powerful plugin system, and an active ecosystem of useful plugins. The easiest way to get started with plugin development is to use @hackebrot’s Cookiecutter template for pytest plugins.

After we’ve created the project using the Cookiecutter template, we need to find the hook that will enable us to customise how assertion errors are reported.

Pytest provides a number of hooks, allowing us to add to or customise various aspects of its functionality, but in this case we need pytest_assertrepr_compare. By creating a function called pytest_assertrepr_compare inside the pytest_{yourpluginname}.py file created by Cookiecutter, we can override the default output pytest prints to the terminal when an assertion fails.

The hook has the following signature (type annotations are my own):

def pytest_assertrepr_compare(
    config: _pytest.config.Config,
    op: str,
    left: Any,
    right: Any,
) -> List[str]

Note: Unfortunately pytest doesn’t appear to pass detailed assertion inspection information into this hook, meaning we can’t take full advantage of assertion rewriting unless we inspect the AST ourselves.

The first parameter is a pytest Config object, which we don’t need to worry about. The op parameter is the operator used in the assert statement in the test, passed as a string. For example, if your assertion looks like assert 1 == 2, then op would be "==". left and right refer to the values that appear on the left- and right-hand sides of the operator. In the preceding example, left would be 1 and right would be 2. The function returns a list of strings, and each of these strings corresponds to a single line of output that pytest will write to the terminal in the event of a failing assertion.

A Short assertrepr_compare Plugin Example

Here’s a minimal example of using the assertrepr_compare hook, which just prints the left and right operands of the assert statement, along with the operator used, to the terminal:

# Basic implementation of the assertrepr_compare hook
def pytest_assertrepr_compare(config, op, left, right):
    return [
        '',  # newline because it looks strange without
        'op: {}'.format(op),
        'left: {}'.format(repr(left)),
        'right: {}'.format(repr(right)),
    ]

# We'll run this test, and check the output
def test_one_equals_two():
    assert 1 == 2

The above implementation of the hook results in the following output after running the test:

___________________ test_one_equals_two ___________________

    def test_one_equals_two():
>       assert 1 == 2
E       assert
E         op: ==
E         left: 1
E         right: 2

tests/test_util.py:64: AssertionError
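
The hook doesn’t have to take over every comparison. According to the pytest documentation, returning None means “no custom explanation”, and pytest falls back to its default report. Here’s a minimal sketch, assuming we only want to customise == comparisons between two dicts (that restriction and the wording of the output lines are illustrative choices, not necessarily what pytest-clarity does):

```python
def pytest_assertrepr_compare(config, op, left, right):
    # Only handle `==` between two dicts (an illustrative restriction).
    if op != "==" or not (isinstance(left, dict) and isinstance(right, dict)):
        return None  # let pytest produce its default explanation

    return [
        "",  # blank first line, as in the example above
        "left:  {!r}".format(left),
        "right: {!r}".format(right),
    ]
```

Because the hook is just a function, it can be called directly (passing None for config) to check what it returns before wiring it into a plugin.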

We now have all the information we need to alter pytest’s output into whatever format we wish. Let’s look at how pytest currently calculates the diffs it shows the user when an assertion fails.

How Pytest Calculates Diffs

Our plugin will rely heavily on difflib, which is included in the Python standard library. difflib provides classes and helper functions for comparing sequences, and computing deltas (diffs) between these sequences. As it turns out, pytest also uses difflib to display the output you see when your assertion is false. It does so using difflib’s ndiff helper function, which returns a generator that yields strings, each of which corresponds to one line of the delta output.

import difflib
lhs = "hello"
rhs = "world"
delta = difflib.ndiff([lhs], [rhs])
print("\n".join(delta))

And here’s the output, which is identical to the output you’d see in pytest if you were to write assert "hello" == "world":

- hello
+ world

The problem with this approach is that it tightly couples the semantics and the presentation of the delta it generates. It expects that we’ll directly output the strings it yields. It’d be nice if we could grab a data structure other than a string which represents the diff itself, and which we can use to generate the colourful output.
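
To see that coupling in action, here’s a small sketch comparing two similar strings (the dict-like strings are made up for illustration). When two lines are close enough, ndiff adds guide lines beginning with "? " that point at the changed characters. The markers are baked into the strings themselves, so there is no way to ask for the positions separately:

```python
import difflib

lhs = "{'a': 1, 'b': 2}"
rhs = "{'a': 1, 'b': 3}"

# ndiff takes sequences of lines, so each string is wrapped in a list.
delta = list(difflib.ndiff([lhs], [rhs]))
for line in delta:
    print(line.rstrip("\n"))

# Lines beginning with "? " are hints aimed at a human reader,
# mixed in with the actual content lines ("- " and "+ ").
guide_lines = [line for line in delta if line.startswith("? ")]
content_lines = [line for line in delta if line[:2] in ("- ", "+ ")]
```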

Luckily, difflib has the answer!

Enter The SequenceMatcher

The SequenceMatcher class is part of difflib, and it provides us with a means of comparing pairs of sequences A and B, as long as their elements are hashable. We can use it to find the exact index of every element in A that would have to be replaced, deleted, or inserted, in order to transform A into B. This is great, because we can now access an abstract representation of a diff, and we can present it however we desire (colours, everywhere).

Before proceeding, let’s look at how to understand and work with the SequenceMatcher. The method we’re interested in is called get_opcodes. This method returns a list of 5-tuples which describe how to transform A into B. The Python difflib documentation has a solid explanation of it, but the code snippet below should give a rough idea of how it works.

import difflib
matcher = difflib.SequenceMatcher(None, "hello", "world")
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    # tag can be one of 'replace', 'delete',
    # 'insert', or 'equal', and represents an operation
    # to be performed in order to transform the
    # left string into the right string.
    # i1:i2 represents a slice of the left string,
    # j1:j2 a slice of the right string.
    # i1:i2 and j1:j2 are the ranges within the strings that
    # the operation should be performed on
    print(tag, repr("hello"[i1:i2]), "->", repr("world"[j1:j2]))

The get_opcodes method gives us the information we need to decide which colour each part of the output should be written in.
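
One way to convince yourself that the opcodes really do describe a complete transformation is to replay them. The sketch below rebuilds the right-hand string using only the left-hand string and the opcodes (apply_opcodes is a hypothetical helper written for this illustration, not part of difflib):

```python
import difflib


def apply_opcodes(a, b):
    """Rebuild b by walking the opcodes that transform a into b."""
    matcher = difflib.SequenceMatcher(None, a, b)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged text is copied from the left string.
            pieces.append(a[i1:i2])
        elif tag in ("replace", "insert"):
            # New text is taken from the right string.
            pieces.append(b[j1:j2])
        # 'delete' contributes nothing: a[i1:i2] is simply dropped.
    return "".join(pieces)


rebuilt = apply_opcodes("hello", "world")
print(rebuilt)
```

The same loop shape is what a colouring function needs: instead of copying each slice, it wraps the slice in a colour chosen from the tag.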

Formatting Terminal Output

Formatting output to the terminal can be tricky. It works by sending escape codes to our terminal, which are essentially “commands”, that we represent using a sequence of characters. If the terminal supports the escape code, rather than printing it, it will perform that command. Terminal escape codes let us do things such as:

  • Change the position of the cursor
  • Change the foreground and background colour of the output text
  • Change the formatting of the output text (italic, bold, etc.)
  • Make the cursor invisible

These escape codes can be difficult to manage, and capabilities can vary depending on the type of terminal you have. The fact that terminal based software such as Vim exists should give an idea of the power that these escape sequences offer. Luckily there are plenty of libraries available which make the process much easier for us. For this plugin, I used the termcolor library. The colored function it provides makes it easy to write colourful and formatted output.
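
To give a feel for what such a library abstracts away, here’s a rough sketch that builds colour sequences by hand using standard SGR parameter numbers (1 for bold, 31 for red, 32 for green, 36 for cyan). The style helper and SGR_CODES table are hypothetical names made up for this illustration. They are not part of termcolor, and this is not a description of how termcolor is implemented:

```python
ESC = "\033"         # the ASCII escape character (decimal 27)
RESET = ESC + "[0m"  # clears all character attributes

# A few standard SGR parameter numbers.
SGR_CODES = {"bold": 1, "red": 31, "green": 32, "cyan": 36}


def style(text, *names):
    """Wrap text in the escape sequence for the given style names."""
    params = ";".join(str(SGR_CODES[name]) for name in names)
    return "{}[{}m{}{}".format(ESC, params, text, RESET)


print(style("removed", "red", "bold"))
print(style("added", "green", "bold"))
```

Even this tiny version shows why a library is welcome: every styled string needs a matching reset, and a mistake in either leaves the terminal stuck in the wrong colour.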

Aside: Partial Function Application In Python

The colored function from termcolor will be used in several places in the plugin. We’ll be passing to it a consistent and verbose set of arguments that we don’t want to have to repeat everywhere, so it’s a great candidate for partial function application! Partial function application lets us “prep” a function by passing in some arguments in advance. Then, when we want to use the function later, we don’t have to pass those arguments in again.

from functools import partial
from termcolor import colored

# termcolor accepts colour and attribute names as plain strings
deleted_text = partial(colored, color="red", attrs=["bold"])
diff_intro_text = partial(colored, color="cyan", attrs=["bold"])
inserted_text = partial(colored, color="green", attrs=["bold"])

Now we can call the functions deleted_text, diff_intro_text, and inserted_text in the same way we could call colored, but we can omit the color and attrs named arguments, since they’ve been applied in advance. Using partial application can make your code more readable, if you give your partially applied functions meaningful names, and use it only where it makes sense.
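
If you’d like to experiment with partial without installing anything, here’s a self-contained sketch. colored_stub is a hypothetical stand-in for termcolor’s colored that tags the text rather than colouring it, so the result is easy to inspect:

```python
from functools import partial


def colored_stub(text, color=None, attrs=None):
    """Hypothetical stand-in for a colouring function: tags text instead."""
    return "<{} {}>{}</{}>".format(color, ",".join(attrs or []), text, color)


deleted_text = partial(colored_stub, color="red", attrs=["bold"])

# Calling the partial is equivalent to spelling out every argument.
short_form = deleted_text("gone")
long_form = colored_stub("gone", color="red", attrs=["bold"])

# Keyword arguments fixed by partial can still be overridden at call time.
overridden = deleted_text("gone", color="yellow")
print(short_form, overridden)
```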

Terminal Escape Sequences

By default, pytest outputs each line of the assertion report as red text (see the screenshot at the start of this post). We don’t want this, so we need to instruct our terminal to revert to standard character formatting. Unfortunately I don’t think termcolor has a function for this, so we have to send the terminal the escape sequence ourselves.

The escape sequence to clear character formatting on VT100-compatible terminals (the majority of terminal emulators support this) is \033[0m. ascii-table.com has a handy reference listing terminal escape sequences. The \033 part is an octal representation of the decimal value 27. If you look up an ASCII table, you’ll find that decimal 27 maps to the ESC (escape) character. ESC followed by [ forms the Control Sequence Introducer, which tells the terminal that a control sequence follows. The remainder of the sequence consists of optional numeric parameters and a final letter that selects a function. In this case, the final m selects character attributes (“Select Graphic Rendition”) and the parameter 0 resets them all, which gives us the “Clear Character Attributes” command. If we print out this escape code at the start of every line of output, we’ll override pytest when it attempts to print out everything in bold red characters, and the terminal will output the text in the default format instead.

def plain_text(string):
    return "\033[0m" + string

Putting It All Together

When a test fails, pytest calls the repr function on both sides of the assert statement, and outputs the diff of these object representations. Rather than relying on repr, we’ll use pprint.pformat, which gives us a nicely formatted string representation of the object that may span multiple lines for clarity, making the output easier to read. pprint.pformat also sorts unordered collections such as dicts and sets when constructing the representation. This is essential: if the representations of two dicts being compared had different key ordering, the diff would highlight differences in ordering rather than in content, and the output could change from run to run.

lhs_repr = pprint.pformat(left, width=width)
rhs_repr = pprint.pformat(right, width=width)
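
Both properties are easy to check in isolation. In this small sketch (the dicts and the width of 20 are made up for illustration), two dicts with the same contents but different insertion order produce identical representations, and a narrow width forces the representation onto several lines:

```python
import pprint

left = {"b": 2, "a": 1, "c": {"z": 26, "y": 25}}
right = {"c": {"y": 25, "z": 26}, "a": 1, "b": 2}

# Same contents, different insertion order: pformat sorts dict keys
# by default, so the two representations are identical.
same_repr = pprint.pformat(left) == pprint.pformat(right)

# A narrow width forces the representation to span multiple lines.
narrow = pprint.pformat(left, width=20)
print(narrow)
```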

Now that we have our “pretty” representations, we can use the SequenceMatcher from earlier to generate a delta between them, and our colouring functions to print out text.

Here’s some code for printing out a split diff (a split diff is where the left and right hand sides of the diffs are printed out independently):

import difflib

# The function name is illustrative; the snippet needs an enclosing
# function for the final return statement to be valid.
def build_split_diff(lhs_repr, rhs_repr):
    lhs_out, rhs_out = "", ""

    matcher = difflib.SequenceMatcher(None, lhs_repr, rhs_repr)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():

        # Deltas can span multiple lines, but we need to
        # operate on a line by line basis so we can override
        # pytest's attempts to print every individual line red
        lhs_substring_lines = lhs_repr[i1:i2].splitlines()
        rhs_substring_lines = rhs_repr[j1:j2].splitlines()

        # Highlight chars to remove from the left hand side
        for i, lhs_substring in enumerate(lhs_substring_lines):
            if op == 'replace':
                lhs_out += deleted_text(lhs_substring)
            elif op == 'delete':
                lhs_out += deleted_text(lhs_substring)
            elif op == 'insert':
                lhs_out += plain_text(lhs_substring)
            elif op == 'equal':
                lhs_out += plain_text(lhs_substring)

            if i != len(lhs_substring_lines) - 1:
                lhs_out += '\n'

        # Highlight the stuff to be added on the right hand side
        for j, rhs_substring in enumerate(rhs_substring_lines):
            if op == 'replace':
                rhs_out += inserted_text(rhs_substring)
            elif op == 'insert':
                rhs_out += inserted_text(rhs_substring)
            elif op == 'equal':
                rhs_out += plain_text(rhs_substring)

            if j != len(rhs_substring_lines) - 1:
                rhs_out += '\n'

    # Return the left and right diffs as lists of strings
    return lhs_out.splitlines(), rhs_out.splitlines()

Conclusion

This post was a quick look at some of the code behind pytest-clarity. The plugin is currently available on PyPI, and you can install it using pip:

pip install pytest-clarity

The full project is available on GitHub:

https://github.com/darrenburns/pytest-clarity


Darren Burns

I'm a Software Engineer working at FanDuel in Edinburgh, Scotland.