I’m currently working on a project where I’m adapting a code base of Databricks notebooks for a new client. There are a few errors to hunt down, but the web UI isn’t really friendly for that.
I just wanted a quick and easy way to find the issues without having to click around.
Here’s a quick script to do just that:
import os, json
import configparser

from databricks_cli.sdk.api_client import ApiClient
from databricks_cli.runs.api import RunsApi


def print_error(nb_path, nb_params, nb_run_url, nb_error="Unknown"):
    # Keep only the first line of the error message.
    error = nb_error.partition("\n")[0]
    params = json.loads(nb_params) if nb_params != "" else {}
    print(
        f"""
Path: {nb_path}
Params: {json.dumps(params, indent=2)}
RunUrl: {nb_run_url}
Error: {error}
"""
    )

databricks_cfg = "~/.databrickscfg"
conf = configparser.ConfigParser()
conf.read(os.path.expanduser(databricks_cfg))
profile = conf["DEFAULT"]
api_client = ApiClient(
    host=profile["host"],
    # Token-based profiles store the secret under "token"; fall back to "password".
    token=profile.get("token") or profile["password"],
)
runs_api = RunsApi(api_client)

# Scan up to 100 runs, 25 per page. Offsets are zero-based, so start at 0.
for offset in range(0, 100, 25):
    page = runs_api.list_runs(
        job_id=None,
        active_only=None,
        completed_only=None,
        offset=offset,
        limit=25,
        version="2.1",
    )
    # "runs" may be missing from the response when a page is empty.
    for run in page.get("runs", []):
        # Runs that are still active have no "result_state" yet.
        if run["state"].get("result_state") != "FAILED":
            continue
        output = runs_api.get_run_output(run_id=run["run_id"])
        task = output["metadata"]["task"]["notebook_task"]
        print_error(
            task["notebook_path"],
            # "Param1Value" is specific to these notebooks; use your own parameter name.
            task.get("base_parameters", {}).get("Param1Value", ""),
            output["metadata"]["run_page_url"],
            output.get("error", "Unknown"),
        )
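
The script checks a fixed window of offsets. If you'd rather keep paging until the API says there is nothing left, here's a minimal sketch. It assumes the response carries a `has_more` flag alongside `runs`, as the Jobs API responses do. `iter_failed_runs` and its `list_runs` argument are names I made up for illustration, not part of databricks-cli.

```python
def iter_failed_runs(list_runs, page_size=25):
    """Yield failed runs, page by page, until no more pages are reported.

    `list_runs` is a hypothetical stand-in: any callable that accepts
    `offset` and `limit` keywords and returns a dict shaped like the
    Jobs API "list runs" response.
    """
    offset = 0
    while True:
        page = list_runs(offset=offset, limit=page_size)
        runs = page.get("runs", [])
        for run in runs:
            # Active runs have no "result_state" yet, hence the .get().
            if run.get("state", {}).get("result_state") == "FAILED":
                yield run
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not runs:
            break
        offset += len(runs)
```

With the real client, something like `functools.partial(runs_api.list_runs, job_id=None, active_only=None, completed_only=None, version="2.1")` should work as the `list_runs` argument, but I haven't checked that against every databricks-cli version.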
Follow this documentation to install the requirements. There’s a lot more you can do with databricks-cli to make your life easier.
It’s a great tool to add to your toolbox.
Have fun!