Rollback Ingestion Run Using Python SDK and REST API

user-2 · October 7, 2024, 12:05am

Is it possible to rollback an ingestion run with the python sdk without using the datahub cli?

datahub_team · October 7, 2024, 12:05am

Hey there!

<@U06TM7M40RK> Can you help with the questions in this thread?

Please continue to tag RunLLM (e.g <@U06TM7M40RK>) to have it continue generating responses!

user-1 · October 7, 2024, 12:05am

Yes, it is possible to rollback an ingestion run using the Python SDK without using the DataHub CLI. The Python SDK’s DataHubGraph client supports various delete operations, including rolling back an ingestion run.

To rollback an ingestion run programmatically, you can use the rollback method provided by the SDK. Here is a step-by-step guide on how to achieve this:

Initialize the DataHubGraph Client: First, you need to initialize the DataHubGraph client.
Execute the Rollback: Use the rollback method to rollback the ingestion run by specifying the run_id.

Here is an example code snippet:

from datahub.emitter.mce_builder import datahub_graph

# Initialize the DataHubGraph client
client = datahub_graph.DataHubGraph()

# Rollback the ingestion run
run_id = "<your_run_id>"
client.rollback(run_id=run_id)

This will rollback the specified ingestion run, deleting the entities created by the run and reverting the updated aspects.

Would you like more details on any specific part of this process?

Sources:

<https://github.com/datahub-project/datahub/blob/master/docs/how/delete-metadata.md|Delete Metadata> 0 button 0 button Hint: Mention @runllm in the thread for followups.

user-2 · October 7, 2024, 12:05am

<@U06TM7M40RK> This doesn’t work, there is no rollback method

user-1 · October 7, 2024, 12:05am

You are correct. The provided data sources do not mention a rollback method in the Python SDK. The rollback functionality is primarily documented for use via the DataHub CLI and the REST API.

To rollback an ingestion run programmatically, you would need to interact with the REST API directly. Here is how you can do it using Python’s requests library:

Set up the necessary imports and client configuration.
Send a POST request to the rollback endpoint with the required payload.

Here is an example code snippet:

import requests
import json

# Configuration
gms_host = "http://localhost:8080"  # Replace with your DataHub GMS host
run_id = "<your_run_id>"  # Replace with your ingestion run ID
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your_access_token>"  # Replace with your access token if needed
}

# Payload for the rollback request
payload = {
    "runId": run_id,
    "dryRun": False,  # Set to True if you want to perform a dry run
    "safe": True  # Set to False if you want to perform an unsafe rollback
}

# Send the rollback request
response = requests.post(f"{gms_host}/runs?action=rollback", headers=headers, data=json.dumps(payload))

# Check the response
if response.status_code == 200:
    print("Rollback successful")
else:
    print(f"Failed to rollback: {response.status_code} - {response.text}")

This code sends a POST request to the DataHub GMS endpoint to rollback the specified ingestion run.

Would you like more details on any specific part of this process?

Sources:

<https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/cli/ingest_cli.py|Ingestion CLI> 0 button 0 button Hint: Mention @runllm in the thread for followups.

datahub_team · October 7, 2024, 12:05am

Hi ! Did this interaction resolve your issue or do you still require assistance ?

Please tag @ RunLLM if you still need further assistance !

Or tag @Datahub Community Support if you require a person to take a look

Hope this helps ! ><

Topic		Replies	Views
Impact of `datahub delete --platform X` Command on Future Ingestions ingestion	4	43	June 10, 2024
Deleting Wrong Dataset Ingestion in DataHub ingestion	4	24	February 17, 2025
Using DataHubGraph for Monitoring and Analytics ingestion	10	26	April 28, 2025
Troubleshooting Stuck DataHub Ingestions from Terminal ingestion	2	29	September 9, 2024
Troubleshooting DataHub Ingestion Issues ingestion	2	77	September 2, 2024

Rollback Ingestion Run Using Python SDK and REST API

Hey there!

<@U06TM7M40RK> Can you help with the questions in this thread?

Related topics