Access the metadata of a profiling job

Sometimes you want to process a job's metadata automatically, for example to record its job ID or peak memory usage in another system.

Accessing job metadata at runtime

Sciagraph uses the Python logging library to record the download instructions. logging is highly customizable, so you can control where Sciagraph’s log message with the download instructions goes. For example, you can write a custom logging.Handler that sends messages to Slack.

The download instructions are logged to the "sciagraph" logging.Logger. The log message is a sciagraph.api.ReportResult object with details on how to download the report. Here’s a custom logging.Handler that writes the report download instructions to a JSON file:

import logging
import json

class SciagraphReportHandler(logging.Handler):
    """Write Sciagraph's report download details to a JSON file."""

    def __init__(self, path):
        super().__init__()
        self._path = path

    def emit(self, record):
        from sciagraph.api import ReportResult

        if isinstance(record.msg, ReportResult):
            # The report message: save it, overwriting any earlier result.
            with open(self._path, "w") as f:
                json.dump(
                    {
                        "download_instructions": record.getMessage(),
                        "job_id": record.msg.job_id,
                    },
                    f,
                )
        else:
            # Any other message sent to the "sciagraph" logger.
            print(record.getMessage())


# Attach the handler to the logger Sciagraph writes to.
logging.getLogger("sciagraph").addHandler(
    SciagraphReportHandler("./result.json")
)

The sciagraph.api.ReportResult object has the following fields, all of them strings unless noted otherwise:

  • job_time: The time the job was started.
  • job_id: The job ID.
  • download_key: The first argument to python -m sciagraph_report download.
  • decryption_key: The second argument to python -m sciagraph_report download.
  • peak_memory_kb: The peak allocated memory for the job, in kibibytes (1 KiB = 1024 bytes), as an integer.
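
Since download_key and decryption_key are the two arguments to python -m sciagraph_report download, a handler can turn a result into a ready-to-run command. The following is a minimal sketch that relies only on the field names listed above. build_download_command is a hypothetical helper, not part of Sciagraph, and a SimpleNamespace with made-up keys stands in for a real ReportResult:

```python
import shlex
from types import SimpleNamespace


def build_download_command(result):
    """Return the download command for `result` as an argument list.

    `result` is any object with `download_key` and `decryption_key`
    string attributes, such as a sciagraph.api.ReportResult.
    (Hypothetical helper, not provided by Sciagraph.)
    """
    return [
        "python", "-m", "sciagraph_report", "download",
        result.download_key,      # first argument
        result.decryption_key,    # second argument
    ]


# Stand-in for a real ReportResult; the key values are placeholders.
fake_result = SimpleNamespace(download_key="KEY1", decryption_key="KEY2")
command = build_download_command(fake_result)

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
```

Returning a list rather than a single string means the command can be passed directly to subprocess.run without shell quoting concerns.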

Accessing job metadata from a report directory

Starting with v2023.5.0, each report directory contains a "report.json" file with the following fields:

  • start_time_secs_since_epoch: The time the job was started, as seconds since the Unix epoch (Jan 1, 1970).
  • duration_ms: The job duration, in milliseconds.
  • id: The job ID.
  • peak_memory_kb: The peak allocated memory for the job, in kibibytes (1 KiB = 1024 bytes), as an integer.
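
The fields above can be read with the standard library alone. The following is a minimal sketch, assuming the time and duration fields are stored as JSON numbers. load_report_metadata and the keys of the dictionary it returns are hypothetical names chosen for this example, not part of Sciagraph:

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def load_report_metadata(report_dir):
    """Read report.json from `report_dir` and convert its fields.

    Hypothetical helper: the returned keys are this example's own names.
    """
    with open(Path(report_dir) / "report.json") as f:
        raw = json.load(f)

    return {
        "id": raw["id"],
        # Seconds since the Unix epoch -> timezone-aware UTC datetime.
        "started": datetime.fromtimestamp(
            raw["start_time_secs_since_epoch"], tz=timezone.utc
        ),
        # Milliseconds -> timedelta.
        "duration": timedelta(milliseconds=raw["duration_ms"]),
        # Kibibytes -> mebibytes (1 MiB = 1024 KiB).
        "peak_memory_mib": raw["peak_memory_kb"] / 1024,
    }
```

Call it with the path to a downloaded report directory, for example load_report_metadata("./my-report"), where the directory name is a placeholder.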