Profiling multiple jobs in a single process

If a single Python process starts multiple jobs that you wish to profile separately, you need to make some changes to how you use Sciagraph.

In particular:

  1. You need to run Sciagraph in a way that tells it you will be profiling multiple jobs.
  2. You need to modify your code to tell Sciagraph when jobs start and finish.

The Sciagraph job APIs will quietly do nothing when you are not running under Sciagraph. This allows your code to keep working in environments where Sciagraph is not configured, for example in unit tests.

Tell Sciagraph you are running multiple jobs by using API mode

If you’re configuring Sciagraph via environment variables, set the SCIAGRAPH_MODE environment variable to api before starting your program, for example:

$ export SCIAGRAPH_MODE=api
$ python

If you are using the command-line, use the --mode=api flag:

$ python -m sciagraph --mode=api run

Note that:

  1. When you’re using API mode, you cannot set the output directory or job IDs via environment variables or command-line options; you will need to use the Python APIs below.
  2. Profiling will not start automatically when the program starts. You will need to explicitly tell Sciagraph when to start, using the APIs we will cover next.

Tell Sciagraph when jobs start and finish

Once you’re running Sciagraph in API mode, you can wrap your code with the sciagraph.api.profile_job(job_id, output_path) context manager. Specify a descriptive job ID and a filesystem path where the report will be stored locally; the report will also be uploaded.

from sciagraph.api import profile_job

def run_my_job():
    with profile_job("my-job", output_path="profiling-reports/my-job"):
        ...  # your code here

When specifying an output path, give each job a unique path, and make sure the parent directory (in the example above, "profiling-reports") is writable.
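One way to generate a unique path per job run is to append a timestamp; a minimal sketch, assuming a timestamped directory layout (the helper name and format are illustrative, not part of Sciagraph's API):

```python
from datetime import datetime, timezone
from pathlib import Path

def unique_output_path(job_id, base_dir="profiling-reports"):
    """Build a per-run report path like profiling-reports/my-job-20240101T120000Z.

    Illustrative helper, not part of Sciagraph's API.
    """
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = Path(base_dir) / f"{job_id}-{timestamp}"
    # Ensure the parent directory exists (and is therefore writable by us)
    # before profiling starts writing the report.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path

print(unique_output_path("my-job"))
```

The resulting path can then be passed as the output_path argument to profile_job().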

Note: You can’t nest multiple calls to profile_job(). Doing so will result in an error.
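Because calls cannot be nested, profile each job sequentially, with one profile_job() block per job. A sketch of that pattern (the try/except fallback and the job names are illustrative; the fallback is only there so the snippet also runs where Sciagraph isn't installed):

```python
try:
    from sciagraph.api import profile_job
except ImportError:
    # Illustrative fallback: a no-op context manager standing in for
    # profile_job in environments where Sciagraph is not installed.
    from contextlib import contextmanager

    @contextmanager
    def profile_job(job_id, output_path):
        yield

def run_job(job_id):
    print(f"running {job_id}")  # stand-in for your real job code

# One profile_job() block per job, entered and exited one after
# another; never nested inside each other.
for job_id in ["job-1", "job-2"]:
    with profile_job(job_id, output_path=f"profiling-reports/{job_id}"):
        run_job(job_id)
```

Each iteration produces a separate report, one per job ID.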