What's changed? Comparing two profiling reports


Optimizing your code is a process: you start with inefficient code and improve it over time. A common pattern is to keep a baseline profiling report of your original, inefficient code. You then try to optimize it, whether by speeding it up or by reducing memory usage, and you want to know whether things have actually improved.

To help you compare a baseline profiling report with a potentially-improved profiling report, Sciagraph comes with a tool called sciagraph-diff. This tool lets you generate a comparison of two pre-existing profiling reports.

To use sciagraph-diff, call it with the paths of the two report directories you want to compare, baseline first:

$ sciagraph-diff ./path/to/base/report/ ./path/to/changed/report/
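If you run comparisons from a script, a small wrapper can catch a mistyped report path before the tool is launched. The following is a minimal sketch, not part of Sciagraph: build_diff_command and run_diff are hypothetical helper names, and the sketch assumes sciagraph-diff is installed and on your PATH. It passes only the two positional arguments shown above.

```python
import subprocess
from pathlib import Path
from typing import List


def build_diff_command(base: Path, changed: Path) -> List[str]:
    """Build the sciagraph-diff argv after checking both report directories exist.

    Hypothetical helper for illustration; not part of Sciagraph itself.
    """
    for report_dir in (base, changed):
        if not report_dir.is_dir():
            raise FileNotFoundError(f"Not a report directory: {report_dir}")
    # Baseline report first, changed report second, as in the command above.
    return ["sciagraph-diff", str(base), str(changed)]


def run_diff(base: Path, changed: Path) -> None:
    """Run the comparison; assumes sciagraph-diff is installed and on PATH."""
    subprocess.run(build_diff_command(base, changed), check=True)
```

Calling run_diff(Path("base_profile"), Path("changed_profile")) would then be equivalent to typing the command by hand.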

An example

Let’s say you want to generate a file where line N is the sum of integers 1 to N:

1
3
6
10
...

Here’s an initial implementation:

from typing import Iterable


def main():
    with open("output.txt", "w") as f:
        for summed in sums(20_000):
            f.write(f"{summed}\n")


def sums(n: int) -> Iterable[int]:
    for i in range(1, n + 1):
        total = 0
        for j in range(i + 1):
            total += j
        yield total


if __name__ == "__main__":
    main()

Step 1: Profile the original code

It’s pretty slow, so let’s profile it:

$ python -m sciagraph -o base_profile/ run list_of_sums.py

Step 2: Try to optimize the code

Next, we’ll optimize the code:

from typing import Iterable


def main():
    with open("output.txt", "w") as f:
        for summed in sums(20_000):
            f.write(f"{summed}\n")


def sums(n: int) -> Iterable[int]:
    for i in range(1, n + 1):
        yield sum(range(i + 1))


if __name__ == "__main__":
    main()
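Before comparing profiles, it can be worth confirming that the optimized code still produces the same results, since a speedup that changes the output is not an improvement. The sketch below copies the two sums() implementations from this page under the hypothetical names sums_original and sums_optimized and compares them on a small input. The same_output helper is also an illustrative name, not part of the example program.

```python
from typing import Iterable


def sums_original(n: int) -> Iterable[int]:
    """Baseline version from this page: explicit inner loop."""
    for i in range(1, n + 1):
        total = 0
        for j in range(i + 1):
            total += j
        yield total


def sums_optimized(n: int) -> Iterable[int]:
    """Changed version from this page: built-in sum() over a range."""
    for i in range(1, n + 1):
        yield sum(range(i + 1))


def same_output(n: int) -> bool:
    """Return True if both versions yield identical sequences for this n."""
    return list(sums_original(n)) == list(sums_optimized(n))


if __name__ == "__main__":
    # A small n keeps the check fast; both versions are quadratic in n.
    print(same_output(200))
```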

Step 3: Profile the changed code

Now that we have hopefully faster code, let’s profile it again, into a different directory:

$ python -m sciagraph -o changed_profile/ run list_of_sums.py
...
$ ls
list_of_sums.py  output.txt  base_profile/  changed_profile/

(And yes, this is still a hugely inefficient way to write this code.)
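For reference, here is a sketch of what a more efficient version might look like. Both versions above re-add the integers from scratch for every line, so the work grows quadratically with N. Carrying a running total makes it linear. The names sums_linear and nth_sum are illustrative and are not part of the example program profiled on this page.

```python
from itertools import accumulate
from typing import Iterable


def sums_linear(n: int) -> Iterable[int]:
    """Yield 1, 3, 6, 10, ... by carrying a running total instead of re-summing."""
    return accumulate(range(1, n + 1))


def nth_sum(n: int) -> int:
    """Closed form for the sum of the integers 1 to n."""
    return n * (n + 1) // 2


if __name__ == "__main__":
    with open("output.txt", "w") as f:
        for summed in sums_linear(20_000):
            f.write(f"{summed}\n")
```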

Step 4: Comparing the two profiling reports

We can compare the original and changed profiles and see what changed; the HTML report will open automatically in your browser.

$ sciagraph-diff ./base_profile/ ./changed_profile/

The report contains two flamegraphs for performance:

  1. The first uses the new code’s flamegraph as the basis, since some code may be only in the new version of the code.
  2. The second uses the old code’s flamegraph as the basis, so it can visualize code that only appears in the old code.

In both cases, red means things got worse for that particular flamegraph, and blue means things got better.
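To see why two flamegraphs are needed, consider a toy model of a diff. This is a conceptual sketch only, not how sciagraph-diff is implemented. The frame names and timings are made up for illustration, and the assumption that a frame missing from one profile counts as zero time is mine.

```python
from typing import Dict


def color_frames(old: Dict[str, float], new: Dict[str, float], basis: str) -> Dict[str, str]:
    """Label each frame of the basis profile as red (worse), blue (better), or unchanged.

    Toy model: a frame absent from one profile is treated as taking zero time there.
    """
    frames = new if basis == "new" else old
    labels = {}
    for name in frames:
        delta = new.get(name, 0.0) - old.get(name, 0.0)
        if delta > 0:
            labels[name] = "red"
        elif delta < 0:
            labels[name] = "blue"
        else:
            labels[name] = "unchanged"
    return labels


# Made-up timings in seconds, purely for illustration.
old_profile = {"main": 10.0, "sums": 9.0, "inner loop": 8.0}
new_profile = {"main": 3.0, "sums": 2.5, "sum()": 2.0}

new_basis = color_frames(old_profile, new_profile, basis="new")
old_basis = color_frames(old_profile, new_profile, basis="old")
```

In this toy model, the frame that exists only in the new profile can appear only when the new profile is the basis, and the frame that exists only in the old profile can appear only when the old profile is the basis. Neither view alone shows every frame.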

In this particular case, we can see that some old code was removed altogether, and new code was added. Overall there was a speedup.

Future work: This is the first version of sciagraph-diff. A future iteration will also include the source code in the flamegraphs, as in normal profiling output.