
Output Viewers

Pavel Dournov edited this page Nov 5, 2018 · 2 revisions

The ML Pipelines UI has built-in support for several types of visualizations, which provide a rich performance evaluation and comparison experience. A component can use them by writing a JSON file to its local filesystem at any point during its execution. The file must be written to the root of the filesystem and named /metadata.json. It contains an array of outputs, each of which describes the metadata for one output viewer (discussed below). Its structure looks like this:

{
  "version": 1,
  "outputs": [
    {
      "type": "confusion_matrix",
      "format": "csv",
      "source": "dir1/matrix.csv",
      "schema": "dir1/schema.json",
      "predicted_col": "column1",
      "target_col": "column2"
    },
    {
      ...
    }
  ]
}

If such a file is written to the component's container filesystem, ML Pipelines extracts it, and the UI uses it to generate the specified viewer(s). The metadata specifies where the artifact data should be loaded from; the UI then loads that data into memory and renders it. Because the data is held in memory, keep it to a size the UI can handle, for example by running a sampling step before exporting it as an artifact.
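As an illustration of the step described above, here is a minimal Python sketch of a component writing the metadata file. The helper name `write_ui_metadata` and its `path` parameter are hypothetical and exist only for this example; the field values are copied from the sample document above.

```python
import json
from pathlib import Path


def write_ui_metadata(outputs, path="/metadata.json"):
    """Write the viewer descriptions to the metadata file and return the document.

    `path` defaults to the location named in this page; it is a parameter
    only so the helper can be tried outside a container.
    """
    metadata = {"version": 1, "outputs": list(outputs)}
    Path(path).write_text(json.dumps(metadata, indent=2))
    return metadata


# One viewer entry, using the same values as the sample document above.
confusion_matrix_viewer = {
    "type": "confusion_matrix",
    "format": "csv",
    "source": "dir1/matrix.csv",
    "schema": "dir1/schema.json",
    "predicted_col": "column1",
    "target_col": "column2",
}
```

Inside the component's container, a call such as `write_ui_metadata([confusion_matrix_viewer])` would produce /metadata.json. Building the list of outputs first and writing once keeps the file valid JSON even when a component emits several viewers.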

These are the different metadata fields that can be specified:

Field name    | Description
--------------|------------
format        | Specifies the format of the artifact data, default is 'csv'. NOTE: The only format supported as of now is 'csv'.
header        | A list of strings that are used as the header of the artifact data.
labels        | A list of strings that are used to label artifact columns/rows.
predicted_col | Name of the predicted column.
schema        | A list of {type, name} objects that specify the schema of the artifact data.
source        | Full path to data. This can contain wildcards '*', in which case the data is concatenated before it's displayed by the UI.
storage       | Storage provider service name, default is 'gcs'.
target_col    | Name of the target column.
type          | Name of the viewer, one of the ones below.

Below are the different types of viewers supported, and the required metadata fields for each:

Confusion Matrix

Metadata Fields:

  • source
  • labels
  • schema
  • format

Plots a Confusion Matrix visualization using the data from the given source path, and the schema to be able to parse the data. Labels provide the names of the classes to be plotted on the x and y axes.
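The following Python sketch shows one way a component might produce the data and the viewer entry for a confusion matrix. This page does not specify the CSV layout, so the (target, predicted, count) row format, the schema column names, and the schema type strings below are assumptions made for illustration; the helper names are hypothetical too.

```python
import csv
import io
from collections import Counter


def confusion_matrix_csv(targets, predictions, labels):
    """Return CSV text with one (target, predicted, count) row per label pair.

    The row layout is an assumption; check it against what the viewer expects.
    """
    counts = Counter(zip(targets, predictions))
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for target in labels:
        for predicted in labels:
            writer.writerow([target, predicted, counts[(target, predicted)]])
    return buffer.getvalue()


def confusion_matrix_entry(source, labels):
    """Build a viewer entry carrying the four fields listed above."""
    return {
        "type": "confusion_matrix",
        "format": "csv",
        "source": source,
        "labels": list(labels),
        # Assumed column names and type strings, for illustration only.
        "schema": [
            {"name": "target", "type": "CATEGORY"},
            {"name": "predicted", "type": "CATEGORY"},
            {"name": "count", "type": "NUMBER"},
        ],
    }
```

The CSV text would be uploaded to the location given in `source`, and the entry appended to the outputs array of the metadata file.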

ROC Curve

Metadata Fields:

  • source
  • format
  • schema

Plots an ROC curve using the data from the given source path. It assumes the schema includes three columns with the following names: "fpr", "tpr" and "thresholds". Hovering over the ROC curve shows the threshold value used for the fpr and tpr values closest to the cursor.
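To make the three expected columns concrete, here is a minimal pure-Python sketch that derives (fpr, tpr, threshold) points from binary labels and scores and serializes them as CSV. The function names, the viewer type string "roc", the schema type strings, and the choice to omit a header row are all assumptions; only the column names come from the description above.

```python
import csv
import io


def roc_points(labels, scores):
    """Return (fpr, tpr, threshold) tuples, one per distinct score, highest first."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example that shares this score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives, score))
    return points


def roc_csv(points):
    """Serialize the points as CSV rows in fpr, tpr, thresholds order."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(points)
    return buffer.getvalue()


def roc_entry(source):
    """Build a viewer entry; the type string "roc" is an assumption."""
    return {
        "type": "roc",
        "format": "csv",
        "source": source,
        "schema": [
            {"name": "fpr", "type": "NUMBER"},
            {"name": "tpr", "type": "NUMBER"},
            {"name": "thresholds", "type": "NUMBER"},
        ],
    }
```

For large evaluation sets, the list of points can be thinned before export so the UI does not have to hold every threshold in memory.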

Table

Metadata Fields:

  • source
  • header
  • format

Builds an HTML table out of the data at the given source path, where the header field specifies what shows up in the first row of the table. The table supports pagination.
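A small Python sketch of preparing data for the table viewer follows. It also includes a sampling helper, in the spirit of the earlier advice to keep artifact data small. The function names and the viewer type string "table" are assumptions, as is the decision to leave the header out of the CSV because it is supplied through the header field.

```python
import csv
import io
import random


def sample_rows(rows, limit, seed=0):
    """Return at most `limit` rows, chosen uniformly at random without replacement."""
    rows = list(rows)
    if len(rows) <= limit:
        return rows
    return random.Random(seed).sample(rows, limit)


def table_csv(rows):
    """Serialize rows as CSV text with no header row (assumed to come from metadata)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def table_entry(source, header):
    """Build a viewer entry; the type string "table" is an assumption."""
    return {
        "type": "table",
        "format": "csv",
        "source": source,
        "header": list(header),
    }
```

A fixed seed makes the sample reproducible between runs, which helps when comparing the same table across pipeline Runs.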

Tensorboard

Metadata Fields:

  • source

Adds a "Start Tensorboard" button to the output page. Clicking this button will start a Tensorboard Pod in the Kubernetes cluster, and switch the button to "Open Tensorboard." Clicking this button again opens up the Tensorboard interface in a new tab, pointing it to the logdir data specified in the source field.

Note that Tensorboard instances are not completely managed by the ML Pipelines UI. The "Start Tensorboard" button is only a convenience feature to avoid interrupting the user's workflow when looking at pipeline Runs. The user is responsible for recycling or deleting those Pods separately using their Kubernetes management tools.

Web App

Metadata Fields:

  • source

To give the user more flexibility in rendering custom output, this viewer supports specifying an HTML file that is created by the component and rendered in the outputs page as is. This file must be self-contained, with no references to other files in the filesystem. It can still have absolute references to files on the web, however. Content running inside this web app is isolated in an iframe, and cannot communicate with the ML Pipelines UI.
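As a sketch of what "self-contained" can mean in practice, the following Python example renders a small report with inline styles and no references to local files. The function names and the viewer type string "web-app" are assumptions for illustration.

```python
import html


def render_report(title, rows):
    """Return a self-contained HTML page: inline styles, no local file references."""
    body_rows = "\n".join(
        "<tr><td>{}</td><td>{}</td></tr>".format(
            html.escape(str(name)), html.escape(str(value))
        )
        for name, value in rows
    )
    return (
        "<!DOCTYPE html>\n<html><head><meta charset=\"utf-8\">"
        "<title>{title}</title>"
        "<style>td {{ padding: 4px 8px; }}</style></head>"
        "<body><h1>{title}</h1><table>\n{rows}\n</table></body></html>"
    ).format(title=html.escape(title), rows=body_rows)


def web_app_entry(source):
    """Build a viewer entry; the type string "web-app" is an assumption."""
    return {"type": "web-app", "source": source}
```

Escaping every value with `html.escape` keeps metric names or labels that contain characters such as `<` from breaking the page.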

