Keep track of Sciagraph reports

When a job profiled by Sciagraph finishes, the resulting report is uploaded in encrypted form to the Sciagraph cloud server, so you can retrieve it later even if your runtime environment goes away. Sciagraph then writes a log message to the "sciagraph" logging.Logger, using the standard Python logging library. That log message tells you how to download the report.

See the article on security and privacy of reports to learn the security implications of this design, and your responsibilities if you want to keep your profiling reports private.

You can also store reports yourself if you want.

Default behavior

If you’ve overridden the default logging destination globally, the log message will go wherever you’ve directed log messages. Otherwise, by default log messages in Python go to standard error.

Either way, a normal log message near the end of your logs will tell you how to download the report. Here’s an example log message:

Successfully uploaded the Sciagraph profiling report.

Job start time: 2022-03-02T10:17:36+00:00
Job ID: Unknown

To see the resulting profiling report, run the following on
Linux/Windows/macOS, Python 3.7+.

If you're inside a virtualenv:

    pip install --upgrade sciagraph-report

Otherwise:

    pip install --user --upgrade sciagraph-report

Then:

    python -m sciagraph_report download 476e2b1a-8c3b-4c25-bff7-b69860b1200a LiEQsqO7U/Q9ygeOESWL3ekWT9zVniTtgHOBqEO2xXjSKLiGfLCH

If you just follow those instructions on your computer, you’ll end up with the report open in a browser.

By default, then, to see the profiling report for a batch job you just go look at that job’s logs; the download instructions should be one of the last messages in the log.
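If you would rather keep the download instructions somewhere more durable than a job's console output, one option is to attach a standard-library FileHandler to the "sciagraph" logger. The following is a minimal sketch, not part of Sciagraph itself: the helper name send_sciagraph_logs_to_file and the file name sciagraph.log are made up for illustration, and the sketch leaves logging levels untouched because the level Sciagraph logs at is not stated here.

```python
import logging


def send_sciagraph_logs_to_file(path):
    """Attach a FileHandler to the "sciagraph" logger and return it.

    Hypothetical helper for illustration; only the logger name
    "sciagraph" comes from the documentation above.
    """
    handler = logging.FileHandler(path, encoding="utf-8")
    # Prefix each message with a timestamp and the logger name.
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(message)s")
    )
    logging.getLogger("sciagraph").addHandler(handler)
    return handler


if __name__ == "__main__":
    # Hypothetical file name; choose a location that outlives the job.
    send_sciagraph_logs_to_file("sciagraph.log")
```

Messages still propagate to any handlers configured on the root logger, so this adds a copy rather than moving the message.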

Customizing where the download instructions go

The Python logging library is highly customizable, so you can override where this log message goes. For example, you can write a custom logging.Handler that sends messages to Slack.
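As a rough illustration of the Slack idea, here is a sketch of a handler that posts each message to a Slack incoming webhook, which accepts a JSON body with a "text" field. The class name SlackHandler, the helper post_json, and the SLACK_WEBHOOK_URL environment variable are assumptions made up for this example; none of them are provided by Sciagraph.

```python
import json
import logging
import os
import urllib.request


def post_json(url, payload):
    """POST a JSON payload to a URL using only the standard library."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


class SlackHandler(logging.Handler):
    """Forward each log message to a Slack incoming webhook (sketch)."""

    def __init__(self, webhook_url, post=post_json):
        super().__init__()
        self._webhook_url = webhook_url
        # The sender is injectable so the handler can be tested offline.
        self._post = post

    def emit(self, record):
        try:
            self._post(self._webhook_url, {"text": record.getMessage()})
        except Exception:
            # Report the failure through logging's own error hook
            # instead of raising inside the profiled job.
            self.handleError(record)


if __name__ == "__main__":
    # Assumed environment variable holding your own webhook URL.
    logging.getLogger("sciagraph").addHandler(
        SlackHandler(os.environ["SLACK_WEBHOOK_URL"])
    )
```

Keep in mind that the download instructions contain the decryption key, so anyone who can read the Slack channel can read the report.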

The actual log message includes a sciagraph.api.ReportResult object with details on how to download the report. Here’s a custom logging.Handler that logs the report download instructions to a JSON file:

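import json
import logging


class SciagraphReportHandler(logging.Handler):
    """Write Sciagraph's report download instructions to a JSON file."""

    def __init__(self, path):
        super().__init__()
        self._path = path

    def emit(self, record):
        from sciagraph.api import ReportResult

        if isinstance(record.msg, ReportResult):
            # The report-upload message: save its details as JSON,
            # overwriting any previous file at this path.
            with open(self._path, "w") as f:
                json.dump(
                    {
                        "download_instructions": record.getMessage(),
                        "job_id": record.msg.job_id,
                    },
                    f,
                )
        else:
            # Any other message sent to the "sciagraph" logger is printed.
            print(record.getMessage())


# Attach the handler to the "sciagraph" logger.
logging.getLogger("sciagraph").addHandler(
    SciagraphReportHandler("./result.json")
)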

The sciagraph.api.ReportResult object has the following fields, all of them strings unless noted otherwise:

  • job_time: The time the job was started.
  • job_id: The job ID.
  • download_key: The first argument to python -m sciagraph_report download.
  • decryption_key: The second argument to python -m sciagraph_report download.
  • peak_memory_kb: The peak allocated memory for the job, in kibibytes (1 kibibyte = 1024 bytes), as an integer.
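To show how these fields fit together, the sketch below rebuilds the download command from download_key and decryption_key and converts peak_memory_kb to mebibytes. FakeReportResult is a stand-in defined only for this example so the code runs without Sciagraph installed; it is not the real sciagraph.api.ReportResult class, and the helper names are likewise hypothetical.

```python
import shlex
from dataclasses import dataclass


@dataclass
class FakeReportResult:
    """Stand-in with the fields listed above; NOT the real class."""

    job_time: str
    job_id: str
    download_key: str
    decryption_key: str
    peak_memory_kb: int


def download_command(result):
    """Build the shell command that downloads the report."""
    args = [
        "python", "-m", "sciagraph_report", "download",
        result.download_key,    # first argument
        result.decryption_key,  # second argument
    ]
    # Quote each argument so the command is safe to paste into a shell.
    return " ".join(shlex.quote(arg) for arg in args)


def peak_memory_mib(result):
    """Convert peak memory from kibibytes to mebibytes."""
    return result.peak_memory_kb / 1024
```

Inside a custom handler's emit() method, the same helpers could be called with record.msg once you have checked that it is a ReportResult.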