Performance and memory profiling for Celery tasks with Sciagraph

Celery tasks running too slowly in production, or using too much memory? You can get results faster—but only if you can find the bottlenecks and fix them.

Sciagraph can help: it’s a performance observability service for Python batch jobs, giving you performance and memory profiling reports for production tasks. And it comes with Celery integration built in.

In order to use Sciagraph with Celery, you need to:

  1. Ensure Sciagraph is installed and activated in the project environment.
  2. Add profiling to your Celery tasks.
  3. Enable profiling on your Celery workers.
  4. Read the resulting reports, and use them to find the bottleneck.

Whether you have ongoing performance issues or a new input has broken previous assumptions, always-on performance observability means profiling data is already available when your code is too slow.

1. Installing and setting up Sciagraph

The short version:

  1. Install Sciagraph in the environment where Celery is running with pip install sciagraph (or by adding it to your requirements.txt/pyproject.toml/etc.).
  2. Sign up for a Sciagraph account.
  3. Set the two access key environment variables provided in the account UI once you’ve signed up:
export SCIAGRAPH_ACCESS_KEY=...your key...
export SCIAGRAPH_ACCESS_SECRET=...your secret...

See the documentation on using Sciagraph in production for a more detailed guide.
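
Before starting a worker, it can help to confirm that both access key variables are actually set in the environment. The following is a minimal sketch; the helper name missing_sciagraph_vars is hypothetical and not part of Sciagraph, and it only checks that the variables are non-empty, not that the credentials are valid.

```python
import os

# Variable names taken from the setup steps above.
REQUIRED_VARS = ("SCIAGRAPH_ACCESS_KEY", "SCIAGRAPH_ACCESS_SECRET")


def missing_sciagraph_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of Sciagraph.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_sciagraph_vars() with no arguments inspects the current process environment; an empty list means both variables are present.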

2. Adding profiling to your Celery tasks

If you have a tasks.py that looks like this:

from celery import Celery

app = Celery("tasks", broker="pyamqp://guest@localhost//")

@app.task
def generate_report(x, y):
    # ... do some work ...
    return x + y

You can add Sciagraph performance report generation to that task by using the sciagraph.integrations.celery.profile decorator:

from celery import Celery
from sciagraph.integrations.celery import profile

app = Celery("tasks", broker="pyamqp://guest@localhost//")

@app.task
@profile  # <-- add decorator
def generate_report(x, y):
    # ... do some work ...
    return x + y
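
In the example above, @profile sits below @app.task. Python applies stacked decorators from the bottom up, so the function Celery registers is the already-wrapped one. The sketch below illustrates only that standard Python behavior; profile_stand_in and register_stand_in are hypothetical stand-ins and do not reflect how Sciagraph or Celery are implemented.

```python
import functools

calls = []  # records the order in which things happen


def profile_stand_in(func):
    """Hypothetical stand-in for @profile: notes each call, then runs func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(f"profiled {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


def register_stand_in(func):
    """Hypothetical stand-in for @app.task: records what it was given."""
    calls.append(f"registered {func.__name__}")
    return func


@register_stand_in   # applied second: receives the wrapped function
@profile_stand_in    # applied first: wraps the original function
def generate_report(x, y):
    return x + y
```

Because the registering decorator receives the wrapper, every later call to generate_report goes through the profiling layer first.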

3. Enabling profiling

Once your tasks have profiling added, you need to enable Sciagraph on the workers that run them. Sciagraph supports two worker pool types: prefork (process pools) and solo.

Prefork / process pools

When using a process pool (“prefork”), you enable Sciagraph by setting the usual SCIAGRAPH_ACCESS_KEY and SCIAGRAPH_ACCESS_SECRET environment variables, plus two additional ones: SCIAGRAPH_MODE and SCIAGRAPH_CELERY_REPORTS_PATH.

$ export SCIAGRAPH_ACCESS_KEY="...get real value from your account..."
$ export SCIAGRAPH_ACCESS_SECRET="...get real value from your account..."
$ export SCIAGRAPH_MODE=celery
$ export SCIAGRAPH_CELERY_REPORTS_PATH=/home/app/sciagraph-reports
$ celery -A tasks worker --pool prefork

The path passed to SCIAGRAPH_CELERY_REPORTS_PATH is where reports will be stored, in subdirectories based on the task name and each task run’s unique ID. In the example above, the generate_report task in tasks.py will produce profiling reports in /home/app/sciagraph-reports/tasks.generate_report/<task ID>.
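
The directory layout described above can be sketched as a small helper that joins the reports path, the task name, and the task ID. The function name expected_report_dir is hypothetical and not part of Sciagraph; the sample values are the ones that appear in the log output later in this guide.

```python
from pathlib import PurePosixPath


def expected_report_dir(reports_path, task_name, task_id):
    """Build <reports path>/<task name>/<task ID>.

    Hypothetical helper for illustration; not part of Sciagraph.
    """
    return PurePosixPath(reports_path) / task_name / task_id


path = expected_report_dir(
    "/tmp/reports",
    "celery_tasks.add",
    "e09f0ca3-a930-4462-9879-bf38e19ccea4",
)
print(path)  # /tmp/reports/celery_tasks.add/e09f0ca3-a930-4462-9879-bf38e19ccea4
```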

Solo

You can also use Sciagraph with a “solo” worker, which runs one task at a time. The configuration is similar to the one above, except that SCIAGRAPH_MODE is set to api instead of celery:

$ export SCIAGRAPH_ACCESS_KEY="...get real value from your account..."
$ export SCIAGRAPH_ACCESS_SECRET="...get real value from your account..."
$ export SCIAGRAPH_MODE=api
$ export SCIAGRAPH_CELERY_REPORTS_PATH=/home/app/sciagraph-reports
$ celery -A tasks worker --pool solo
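
The only difference between the two setups is the SCIAGRAPH_MODE value that goes with each pool. If you launch workers from a script, that pairing can be captured in one place. This is a minimal sketch: the names MODE_BY_POOL and sciagraph_worker_env are hypothetical, and the access key variables are assumed to already be present in the base environment.

```python
import os

# Pool-to-mode pairing as described in the two sections above.
MODE_BY_POOL = {"prefork": "celery", "solo": "api"}


def sciagraph_worker_env(pool, reports_path, base_env=None):
    """Return a copy of the environment with the Sciagraph worker settings.

    Hypothetical helper for illustration; not part of Sciagraph.
    """
    if pool not in MODE_BY_POOL:
        raise ValueError(f"No Sciagraph mode known for pool {pool!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["SCIAGRAPH_MODE"] = MODE_BY_POOL[pool]
    env["SCIAGRAPH_CELERY_REPORTS_PATH"] = str(reports_path)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run(["celery", "-A", "tasks", "worker", "--pool", pool], env=env).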

4. Reading the reports

There are two ways to read the reports:

  1. Download the generated reports from Sciagraph’s cloud storage service.
  2. Read locally stored copies of the reports.

Downloading reports

By default, Sciagraph will upload end-to-end encrypted copies of the reports to its cloud storage server. Instructions on how to download these reports will be output in the worker’s logs. Anyone with access to the logs will be able to download and view the reports from any computer with Python installed.

For example, here’s what the logs might look like:

$ export SCIAGRAPH_MODE=api
$ celery -A tasks worker --pool solo
...
[2022-07-19 13:45:04,305: WARNING/MainProcess] Successfully uploaded the Sciagraph profiling report.

Job start time: 2022-07-19T17:45:03+00:00
Job ID: celery_tasks.add/e09f0ca3-a930-4462-9879-bf38e19ccea4

The report was stored locally at path /tmp/reports/celery_tasks.add/e09f0ca3-a930-4462-9879-bf38e19ccea4

An encrypted copy of the report was uploaded to the Sciagraph storage server.
To download the report, run the following on Linux/Windows/macOS, Python 3.7+.

If you're inside a virtualenv:

    pip install --upgrade sciagraph-report

Otherwise:

    pip install --user --upgrade sciagraph-report

Then:

    python -m sciagraph_report download 907e57c4-23d4-4237-88db-4a5da04a9d65 1/Te9N2ZNqlBREWWtngiu7DN25hyNN/RIvh7QkgmtOEbpWyTVwdn

Follow those instructions, and you can view the report.

Reading locally-stored reports

Sciagraph will also store the reports locally, on the machine running the worker. Specifically, it will store them in the directory specified by SCIAGRAPH_CELERY_REPORTS_PATH.

For example, if SCIAGRAPH_CELERY_REPORTS_PATH=/tmp/reports, after running the add() task we’ll see:

$ ls /tmp/reports/
celery_tasks.add
$ ls /tmp/reports/celery_tasks.add/
e09f0ca3-a930-4462-9879-bf38e19ccea4
$ ls /tmp/reports/celery_tasks.add/e09f0ca3-a930-4462-9879-bf38e19ccea4/
index.html  peak-memory.prof  peak-memory-reversed.svg  peak-memory.svg  performance  performance.prof  performance-reversed.svg  performance.svg

Open index.html in your browser to see the report.
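
With many task runs, finding the newest report by hand gets tedious. The sketch below lists every index.html under the reports directory, newest first, assuming the <task name>/<task ID>/index.html layout shown in the listing above. The function name find_report_pages is hypothetical and not part of Sciagraph.

```python
from pathlib import Path


def find_report_pages(reports_path):
    """Return the index.html path of each stored report, newest first.

    Hypothetical helper for illustration; assumes the layout
    <reports path>/<task name>/<task ID>/index.html.
    """
    pages = Path(reports_path).glob("*/*/index.html")
    # Sort by file modification time, most recent first.
    return sorted(pages, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, find_report_pages("/tmp/reports") returns a list of paths, and the first entry can be opened with webbrowser.open(pages[0].as_uri()).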

5. Bonus: Making sure old reports are cleaned up

When Sciagraph is enabled, every run of a task with profiling enabled will write out a report. By default, only the last 1000 reports are kept; older ones are removed.

To keep more, set the SCIAGRAPH_CELERY_MAX_REPORTS environment variable before starting the worker, for example:

$ export SCIAGRAPH_CELERY_MAX_REPORTS=5000
$ export SCIAGRAPH_MODE=celery
$ celery -A tasks worker --pool=prefork