List of errors from Databricks API

I’m currently working on a project where I’m adapting a code base of Databricks notebooks for a new client. There are a few errors to hunt down, but the web UI is not well suited to that.

I just wanted a quick and easy way to find the issues without clicking around.

Here’s a quick script to do just that:

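import os
import configparser
from databricks_cli.sdk.api_client import ApiClient
from databricks_cli.runs.api import RunsApi

databricks_cfg = '~/.databrickscfg'

# Read the host and credentials from the databricks-cli config file.
conf = configparser.ConfigParser()
conf.read(os.path.expanduser(databricks_cfg))

# This config keeps the access token under "password"; a file created with
# `databricks configure --token` stores it under "token" instead.
api_client = ApiClient(
    host=conf["DEFAULT"]["host"],
    token=conf["DEFAULT"]["password"]
)

runs_api = RunsApi(api_client)

# Walk the 100 most recent runs, 25 per page; offsets are 0-based.
page_size = 25
for offset in range(0, 100, page_size):
    page = runs_api.list_runs(job_id=None, active_only=None,
                              completed_only=None, offset=offset,
                              limit=page_size, version="2.1")
    # The "runs" key is missing when a page is empty.
    for run in page.get("runs", []):
        # Runs still in progress have no result_state yet.
        if run["state"].get("result_state") == "FAILED":
            output = runs_api.get_run_output(run_id=run["run_id"])
            notebook_path = output["metadata"]["task"]["notebook_task"]["notebook_path"]
            print(notebook_path, output.get("error"))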
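
If you need more than a fixed window of runs, the paging and filtering in the script above can be pulled into a small generator. This is only a sketch: `iter_failed_runs`, `fetch_page`, and the fake data are names I made up for illustration, and it assumes each list-runs response carries a `has_more` flag next to `runs`. It runs against a stand-in function, so you can try it without a workspace.

```python
from typing import Callable, Dict, Iterator, List


def iter_failed_runs(fetch_page: Callable[[int, int], Dict],
                     page_size: int = 25) -> Iterator[Dict]:
    """Yield every run whose result_state is FAILED, page by page.

    fetch_page(offset, limit) is a hypothetical callable that returns one
    list-runs response as a dict.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        runs = page.get("runs", [])
        for run in runs:
            # Runs that are still in progress carry no result_state.
            if run.get("state", {}).get("result_state") == "FAILED":
                yield run
        # Stop on an empty page or when the response reports no further pages.
        if not runs or not page.get("has_more", False):
            break
        offset += len(runs)


# Stand-in for the API so the sketch runs without a workspace.
FAKE_RUNS: List[Dict] = [
    {"run_id": 1, "state": {"result_state": "SUCCESS"}},
    {"run_id": 2, "state": {"result_state": "FAILED"}},
    {"run_id": 3, "state": {"life_cycle_state": "RUNNING"}},
    {"run_id": 4, "state": {"result_state": "FAILED"}},
]


def fake_fetch_page(offset: int, limit: int) -> Dict:
    chunk = FAKE_RUNS[offset:offset + limit]
    return {"runs": chunk, "has_more": offset + limit < len(FAKE_RUNS)}


failed_ids = [run["run_id"] for run in iter_failed_runs(fake_fetch_page, page_size=2)]
print(failed_ids)
```

With the real client, `fetch_page` could be a lambda that wraps the same `runs_api.list_runs(...)` call used in the script, passing `offset` and `limit` through.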

Follow this documentation to install the requirements. There’s a lot more you can do with databricks-cli to make your life easier. It’s a great tool to add to your toolbox.

Have fun!
