Profiling with Sciagraph: the basics

To profile your code with Sciagraph, you need to ensure that:

The sciagraph package is installed.
You have an access token stored on disk or set via environment variables.
Your program is run with Sciagraph enabled.

There are also custom integrations for different frameworks, including Jupyter, Celery and MLFlow, with more coming soon.

To use Sciagraph, you need an account with the Sciagraph service: sign up for a free or paid Sciagraph account. Your account page provides the access token you will set up in Step 2.

Step 1: Making sure `sciagraph` is installed

The sciagraph package can be installed normally from PyPI. Make sure you’re using a recent version of pip by upgrading it first; inside a virtualenv, you can upgrade pip and then install the package by running:

pip install --upgrade pip
pip install sciagraph

Because it’s a normal PyPI package, you can add sciagraph as a dependency of your application by listing it in the relevant dependency file:

requirements.txt
setup.py
Pipfile (if you’re using Pipenv)
pyproject.toml (if you’re using Poetry or Flit)
environment.yml (if you’re using Conda)

Conda packages are not available yet.
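To confirm that the package is visible to the interpreter you plan to profile with, you can query its installed version. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and not part of Sciagraph.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this interpreter's environment.
        return None
```

For example, `installed_version("sciagraph")` returns `None` when the package is missing from the current virtualenv, which usually means pip installed it into a different environment.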

Step 2: Making sure an access token is available

In order to validate that you are a licensed user of Sciagraph (on a free or paid plan), you need to set up the access token. You can use a configuration file or environment variables.

Option #1: Storing the access token in a file

If you visit your account page, it will include a command to run that will store the access token in a config file on disk. It will look something like this:

$ python -m sciagraph.store_token ...

This is the recommended option when profiling during development, because you only have to do it once.

Option #2: Setting the access token using environment variables

If you don’t use a config file, you need to set two environment variables wherever your program is running: SCIAGRAPH_ACCESS_KEY and SCIAGRAPH_ACCESS_SECRET. You can get the values for both from your account page.

In shell scripts you can just set these with an export command:

export SCIAGRAPH_ACCESS_KEY=...
export SCIAGRAPH_ACCESS_SECRET=...
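If you rely on environment variables, it can help to fail early when one of them is missing, for example at the start of a batch job. This is a minimal sketch, assuming you use Option #2 rather than a config file; the helper name `missing_credentials` is hypothetical and only the two variable names come from Sciagraph.

```python
import os

# The two variable names documented for environment-based configuration.
REQUIRED_VARIABLES = ("SCIAGRAPH_ACCESS_KEY", "SCIAGRAPH_ACCESS_SECRET")


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]
```

An empty list means both variables are present. The check says nothing about whether the values are valid, and it does not apply if the token is stored in a config file.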

Setting environment variables in containers

Container runtimes typically have a way to set environment variables. For example:

Docker Compose files let you set environment variables in a variety of ways, for example from .env files.
Kubernetes lets you set environment variables from secrets.

Please reach out if you need help.

Step 3: Run your program with Sciagraph enabled

By default Sciagraph profiles the whole process, from start to finish, which we’ll cover here. Alternatively, you can also run multiple jobs in a single process.

Let’s say your program is typically run like this:

$ python yourprogram.py --load=data/ --twiddle=2.718

There are two ways you can run your program with Sciagraph.

Option #1: Running your program with `python -m sciagraph`

Instead of running your program as above, you can run it with python -m sciagraph run:

$ python -m sciagraph run yourprogram.py --load=data/ --twiddle=2.718

This launches a new Python subprocess, and that is what actually runs your code. Any arguments after run are passed to the new Python interpreter. So if your program is typically run like this:

$ python -m yourpackage arg1 arg2

You can run it with Sciagraph like so:

$ python -m sciagraph run -m yourpackage arg1 arg2
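If you launch jobs from Python, for example from a scheduler script, the same rewrite can be done programmatically. This is a minimal sketch: the helper name `sciagraph_command` is hypothetical, and it only assembles the argument list in the shape shown above, with everything after `run` passed to the new interpreter.

```python
import sys


def sciagraph_command(python_args, sciagraph_options=()):
    """Build the argument list for `python -m sciagraph [options] run <python_args>`.

    `python_args` is whatever you would normally put after `python`, e.g.
    ["yourprogram.py", "--load=data/"] or ["-m", "yourpackage", "arg1"].
    `sciagraph_options` go before `run`, e.g. ["--output-path", "./profiling-report"].
    """
    return [sys.executable, "-m", "sciagraph", *sciagraph_options, "run", *python_args]
```

The resulting list can be passed to `subprocess.run(...)`. Using `sys.executable` keeps the profiled program in the same virtualenv as the launcher script.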

Option #2: Automatically profile all Python commands

In some cases you can’t use python -m sciagraph, or you may want to automatically profile all Python programs you run. You can do so by setting an environment variable:

$ export SCIAGRAPH_MODE=process
$ python yourprogram.py --load=data/ --twiddle=2.718

The Python program above will be automatically profiled using Sciagraph, because that environment variable is set.
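Exporting the variable in a shell affects every Python command run from that shell. If you only want to profile one child process started from Python, you can set the variable for that child alone. This is a minimal sketch under that assumption; the helper names `process_mode_env` and `run_profiled` are hypothetical.

```python
import os
import subprocess
import sys


def process_mode_env(base_env=None):
    """Return a copy of the environment with SCIAGRAPH_MODE=process added."""
    env = dict(os.environ if base_env is None else base_env)
    env["SCIAGRAPH_MODE"] = "process"
    return env


def run_profiled(python_args):
    """Run `python <python_args>` with process mode enabled for the child only."""
    completed = subprocess.run([sys.executable, *python_args], env=process_mode_env())
    return completed.returncode
```

For example, `run_profiled(["yourprogram.py", "--load=data/", "--twiddle=2.718"])` runs the program from the earlier example without changing the parent process's environment.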

Optional: Configure where reports are written

After you’ve profiled a program, a profiling report is written to disk. By default, reports go in a new directory of the form sciagraph-result/<timestamp> in the process’s current working directory. The report’s location is printed in a message at the end of the run.
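If you miss the message, you can locate the most recent report in the default location from a script. This is a minimal sketch, assuming the default sciagraph-result/<timestamp> layout described above; the helper name `latest_report_dir` is hypothetical, and it picks the newest directory by modification time rather than by parsing the timestamp, whose exact format is not assumed here.

```python
from pathlib import Path


def latest_report_dir(working_dir="."):
    """Return the most recently modified report directory, or None if there is none."""
    results_root = Path(working_dir) / "sciagraph-result"
    if not results_root.is_dir():
        return None
    candidates = [path for path in results_root.iterdir() if path.is_dir()]
    if not candidates:
        return None
    # Newest by modification time; no assumption about the directory name format.
    return max(candidates, key=lambda path: path.stat().st_mtime)
```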

Depending on how you run Sciagraph, you can also configure a custom storage location.

Customizing, option #1: CLI

If you’re using the CLI, you can override the destination with the --output-path option:

$ python -m sciagraph --output-path ./profiling-report run yourprogram.py

Customizing, option #2: Environment variables

If you’re running Sciagraph with SCIAGRAPH_MODE=process, you can set the SCIAGRAPH_OUTPUT_PATH environment variable to customize where reports are stored.

$ export SCIAGRAPH_MODE=process
$ export SCIAGRAPH_OUTPUT_PATH=./profiling-report
$ python yourprogram.py

Your report will now be stored in the ./profiling-report directory.

Optional: Automatically opening profiling reports in a browser

In process mode, in GUI environments, Sciagraph will automatically open reports in a browser. You can override this with the --open-browser option, e.g.:

$ python -m sciagraph --open-browser=no run yourprogram.py

Possible values are no (never open in a browser), yes (always open), and auto (the default).

You can also control this with the SCIAGRAPH_OPEN_BROWSER environment variable, which takes the same options.
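When launching profiled jobs from a script, for example on a headless CI machine, you may want to set the report options through environment variables and catch typos in the browser setting before the job starts. This is a minimal sketch; the helper name `report_settings` is hypothetical, and only the variable names and the yes/no/auto values come from the text above.

```python
OPEN_BROWSER_VALUES = ("yes", "no", "auto")


def report_settings(output_path=None, open_browser=None):
    """Return the environment variable overrides for the optional report settings."""
    settings = {}
    if output_path is not None:
        settings["SCIAGRAPH_OUTPUT_PATH"] = str(output_path)
    if open_browser is not None:
        if open_browser not in OPEN_BROWSER_VALUES:
            raise ValueError(
                f"open_browser must be one of {OPEN_BROWSER_VALUES}, got {open_browser!r}"
            )
        settings["SCIAGRAPH_OPEN_BROWSER"] = open_browser
    return settings
```

The returned dictionary can be merged into the environment passed to a child process, e.g. `env={**os.environ, **report_settings("./profiling-report", "no")}`.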