List of errors from Databricks API

17 Feb 2023, 11:01

Azure / Databricks / Python

I’m currently working on a project where I’m adapting a code base of Databricks notebooks for a new client. There are a few errors to hunt down, but the web UI isn’t well suited to that.

I just wanted a quick and easy way to find the issues without clicking around.

Here’s a quick script to do just that:

import configparser
import json
import os
from databricks_cli.sdk.api_client import ApiClient
from databricks_cli.runs.api import RunsApi


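def print_error(nb_path, nb_params, nb_run_url, nb_error="Unknown"):
    """Print a short report for one failed notebook run."""
    # Keep only the first line of the error message.
    error = nb_error.partition("\n")[0]
    # The parameter holds a JSON string; an empty string means no parameters.
    params = json.loads(nb_params) if nb_params != "" else {}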
    print(
        f"""
Path:	{nb_path}
Params:	{json.dumps(params,indent=2)}
RunUrl:	{nb_run_url}
Error:	{error}
"""
    )


databricks_cfg = "~/.databrickscfg"

conf = configparser.ConfigParser()
conf.read(os.path.expanduser(databricks_cfg))

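# This config file stores the access token under the "password" key;
# adjust the key name if your file uses a different one.
api_client = ApiClient(
    host=conf["DEFAULT"]["host"],
    token=conf["DEFAULT"]["password"]
)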

runs_api = RunsApi(api_client)

# Walk the 100 most recent runs, 25 per page. The offset is zero-based,
# so starting at 0 includes the most recent run.
for offset in range(0, 100, 25):
    page = runs_api.list_runs(
        job_id=None,
        active_only=None,
        completed_only=None,
        offset=offset,
        limit=25,
        version="2.1",
    )
    # An empty page may omit the "runs" key entirely.
    for run in page.get("runs", []):
        # Runs that are still active have no result_state yet.
        if run["state"].get("result_state") != "FAILED":
            continue

        output = runs_api.get_run_output(run_id=run["run_id"])
        notebook_task = output["metadata"]["task"]["notebook_task"]
        print_error(
            notebook_task["notebook_path"],
            # "Param1Value" is specific to this code base; use your own parameter name.
            notebook_task["base_parameters"]["Param1Value"],
            output["metadata"]["run_page_url"],
            # Fall back to the default label when the output has no error text.
            output.get("error", "Unknown"),
        )
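
The script above looks at a fixed window of four pages of 25 runs. If you need to go further back, a possible variation is to keep requesting pages until the API reports there are none left. The sketch below is only an illustration: `iter_failed_runs` and `list_page` are hypothetical names, and it assumes the list response carries a boolean `has_more` field alongside `runs`, as the Jobs API 2.1 does to the best of my knowledge.

```python
from typing import Any, Callable, Dict, Iterator


def iter_failed_runs(
    list_page: Callable[[int, int], Dict[str, Any]],
    page_size: int = 25,
) -> Iterator[Dict[str, Any]]:
    """Yield every run whose result_state is FAILED, one page at a time.

    list_page(offset, limit) is a hypothetical callable that returns one
    page of the runs listing as a dict.
    """
    offset = 0
    while True:
        page = list_page(offset, page_size)
        runs = page.get("runs", [])
        for run in runs:
            # Active runs have no result_state yet, hence the .get() calls.
            if run.get("state", {}).get("result_state") == "FAILED":
                yield run
        # Assumption: "has_more" is False (or absent) on the last page.
        if not runs or not page.get("has_more", False):
            break
        offset += len(runs)
```

To wire it to the script, `list_page` could be a small lambda that forwards `offset` and `limit` to `runs_api.list_runs(...)` with the same remaining arguments as above.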

Follow this documentation to install the requirements. There’s a lot more you can do with databricks-cli to make your life easier. It’s a great tool to add to your toolbox.

Have fun!
