How to call a Databricks notebook from Python with the REST API


I want to create a Python notebook on my desktop that passes an input to another notebook in Databricks and then returns that notebook's output. For example, my local Python file will pass a string into a Databricks notebook, which will reverse the string and send the result back to my local Python file. What would be the best way to achieve this?
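For context, the notebook on the Databricks side would look something like this (a minimal sketch; the widget name input_string is just an example):

# Contents of /Users/[email protected]/api_test on Databricks
# Read the input parameter passed in via base_parameters on the REST call
dbutils.widgets.text("input_string", "")
value = dbutils.widgets.get("input_string")

# Reverse the string and return it to the caller;
# dbutils.notebook.exit() is what populates notebook_output
dbutils.notebook.exit(value[::-1])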

This is what I have so far. I get a response from the API, but I was expecting the metadata to include an attribute called "notebook_output". What am I missing, or is there somewhere else I can look to get the notebook output from the run?

import json
import os
from databricks_cli.sdk.api_client import ApiClient
from databricks_cli.runs.api import RunsApi

os.environ['DATABRICKS_HOST'] = "https://adb-################.##.azuredatabricks.net/"
os.environ['DATABRICKS_TOKEN'] = "token-value"

api_client = ApiClient(host=os.getenv('DATABRICKS_HOST'), token=os.getenv('DATABRICKS_TOKEN'))

runJson = """
        {
        "name": "test job",
        "max_concurrent_runs": 1,
        "tasks": [
            {
            "task_key": "test",
            "description": "test",
            "notebook_task":
                {
                "notebook_path": "/Users/[email protected]/api_test"
                },
            "existing_cluster_id": "cluster_name",
            "timeout_seconds": 3600,
            "max_retries": 3,
            "retry_on_timeout": true
            }
            ]
        }
        """

runs_api = RunsApi(api_client)
# submit_run expects a parsed dict, not a raw JSON string
run = runs_api.submit_run(json.loads(runJson))
metadata = runs_api.get_run_output(run['run_id'])['metadata']

Output:

{'job_id': 398029273095601, 'run_id': 150609942, 'creator_user_name': 'user', 'number_in_job': 150609942, 'state': {'life_cycle_state': 'TERMINATED', 'result_state': 'SUCCESS', 'state_message': '', 'user_cancelled_or_timedout': False}, 
'task': {'notebook_task': {'notebook_path': 'path', 'source': 'WORKSPACE'}}, 
'cluster_spec': {'existing_cluster_id': 'cluster'}, 
'cluster_instance': {'cluster_id': 'id', 'spark_context_id': 'id'}, 'start_time': 1683904971067, 'setup_duration': 1000, 'execution_duration': 8000, 'cleanup_duration': 0, 'end_time': 1683904981007, 'run_duration': 9940, 'run_name': 
'Untitled', 'run_page_url': 'url', 'run_type': 'SUBMIT_RUN', 'attempt_number': 0, 'format': 'SINGLE_TASK'}

1 Answer

Answered by Abhishek Jain:

To get notebook_output you need the api/2.0/jobs/runs/get-output endpoint, which is exactly what RunsApi.get_run_output calls. In its response, notebook_output is a top-level field alongside metadata, so indexing into ['metadata'] discards it. Also note that notebook_output is only populated if the notebook returns a value via dbutils.notebook.exit(), and only after the run has reached a terminal state.
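A minimal sketch of the end-to-end flow, building on the code in the question (it assumes the notebook exits via dbutils.notebook.exit; the parameter name input_string and the polling interval are illustrative):

import json
import time

payload = {
    "run_name": "reverse string test",
    "tasks": [
        {
            "task_key": "test",
            "notebook_task": {
                "notebook_path": "/Users/[email protected]/api_test",
                # base_parameters is how the input string reaches the notebook's widgets
                "base_parameters": {"input_string": "hello"}
            },
            "existing_cluster_id": "cluster_name"
        }
    ]
}

run = runs_api.submit_run(payload)
run_id = run['run_id']

# Poll until the run reaches a terminal state; notebook_output
# is not available while the run is still in progress
while runs_api.get_run(run_id)['state']['life_cycle_state'] in ('PENDING', 'RUNNING', 'TERMINATING'):
    time.sleep(5)

output = runs_api.get_run_output(run_id)
print(output['notebook_output']['result'])   # 'olleh'

One caveat: for runs in the multi-task format, runs/get-output must be called with the run_id of the individual task run (listed under tasks in the runs/get response); for a single-task submit like the one in the question (format SINGLE_TASK in the output above), the top-level run_id works.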