Troubleshooting Tips for Hanging DBT Cloud Ingestion Jobs

Original Slack Thread

<@U06TM7M40RK> When running several dbt Cloud ingestion jobs simultaneously, we often experience the ingestion jobs hanging, i.e. never timing out, being cancelled, or completing

If multiple dbt Cloud ingestion jobs run simultaneously and some of them hang, there are several steps you can take to troubleshoot and potentially resolve the issue:

  1. Resource Allocation: Ensure that the datahub-actions pod has sufficient CPU and memory. Inadequate resources are a common cause of hung ingestion jobs. You can verify and adjust the allocation with Kubernetes commands or your Helm values (see the first sketch after this list).

  2. Timeout Configuration: You can set the timeout_sec parameter on the datahub-rest sink in your ingestion recipe, but note that this is a per-HTTP-request timeout, not a timeout for the job as a whole. To bound the entire ingestion run, set activeDeadlineSeconds on the Kubernetes CronJob or Job that launches it (see the second sketch after this list).

  3. Cancel Stuck Jobs: If an ingestion job hangs, you can stop it manually by identifying the process ID (PID) inside the datahub-actions container and killing it, using Docker or kubectl depending on how DataHub is deployed (see the commands after this list).

  4. Check Logs for Issues: After stopping a stuck ingestion, inspect the logs of the datahub-actions container to understand why it hung (the log commands are included in the same sketch).

  5. Stagger Ingestion Runs: To avoid overloading the system, stagger the schedules of your ingestion runs so they do not overlap (see the schedule sketch after this list).

  6. Use Kafka Sink for Scalability: If you are using the datahub-rest sink and hitting scalability limits, consider switching to the Kafka sink (see the recipe sketch after this list).
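
For point 1, a minimal sketch of raising the datahub-actions resources, assuming a Helm-based Kubernetes deployment. The values keys follow the standard DataHub Helm chart but may differ in your chart version, and the numbers are only illustrative:

      # values.yaml excerpt (keys assume the standard DataHub Helm chart)
      acryl-datahub-actions:
        resources:
          requests:
            cpu: "1"
            memory: 2Gi
          limits:
            cpu: "2"
            memory: 4Gi

You can check whether the pod is actually hitting its limits with kubectl top pod and kubectl describe pod on the actions pod (the pod name depends on your release).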
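
For point 2, a sketch showing both timeout levels. The recipe excerpt assumes the dbt Cloud source and the datahub-rest sink; timeout_sec only bounds individual HTTP requests, while activeDeadlineSeconds on the Kubernetes Job/CronJob caps the whole run. Names, image tag, schedule, and durations are illustrative:

      # Ingestion recipe excerpt -- per-request timeout on the REST sink
      source:
        type: dbt-cloud
        config: {}                        # dbt Cloud connection details elided
      sink:
        type: datahub-rest
        config:
          server: "http://datahub-gms:8080"
          timeout_sec: 120                # per-HTTP-request timeout, not a whole-job timeout

      # CronJob excerpt -- hard cap on the whole ingestion run
      apiVersion: batch/v1
      kind: CronJob
      metadata:
        name: dbt-cloud-ingestion         # illustrative name
      spec:
        schedule: "0 2 * * *"
        jobTemplate:
          spec:
            activeDeadlineSeconds: 3600   # Kubernetes terminates the job after 1 hour
            template:
              spec:
                restartPolicy: Never
                containers:
                  - name: ingestion
                    image: acryldata/datahub-ingestion:head   # pin to your DataHub version
                    args: ["ingest", "-c", "/recipes/dbt_cloud.yml"]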
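
For points 3 and 4, hedged command sketches. The container, deployment, and process names are illustrative and depend on whether you run the Docker quickstart or a Kubernetes deployment:

      # Docker: find the stuck ingestion process inside the datahub-actions container and kill it
      docker exec datahub-actions ps -ef | grep 'datahub ingest'   # note the PID
      docker exec datahub-actions kill <PID>

      # Then review the logs to see where it got stuck
      docker logs --tail 500 datahub-actions

      # Kubernetes equivalents (adjust the deployment name to your release)
      kubectl exec deploy/datahub-acryl-datahub-actions -- ps -ef | grep 'datahub ingest'
      kubectl exec deploy/datahub-acryl-datahub-actions -- kill <PID>
      kubectl logs deploy/datahub-acryl-datahub-actions --tail=500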
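
For point 5, one simple way to stagger recipe-based runs is to offset the CronJob schedules; scheduled UI ingestion can be offset the same way from each source's schedule settings. Names and times below are illustrative, and the jobTemplate is omitted for brevity:

      # Two ingestion CronJobs with offset start times so the runs don't overlap
      ---
      apiVersion: batch/v1
      kind: CronJob
      metadata:
        name: dbt-cloud-ingestion-project-a
      spec:
        schedule: "0 2 * * *"      # 02:00
        # jobTemplate omitted for brevity
      ---
      apiVersion: batch/v1
      kind: CronJob
      metadata:
        name: dbt-cloud-ingestion-project-b
      spec:
        schedule: "30 2 * * *"     # 02:30
        # jobTemplate omitted for brevity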
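
For point 6, a recipe excerpt that switches the sink from datahub-rest to datahub-kafka; the broker and schema registry hostnames are illustrative and must match your deployment:

      # Recipe excerpt -- emit metadata to Kafka instead of the REST endpoint
      sink:
        type: datahub-kafka
        config:
          connection:
            bootstrap: "prerequisites-kafka:9092"
            schema_registry_url: "http://prerequisites-cp-schema-registry:8081"

Because the Kafka sink writes asynchronously and GMS consumes the events in its own time, slow REST responses from GMS are less likely to stall the ingestion run.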

For more detailed steps on any of these points, please let me know.

<@U06TM7M40RK> Are there any other global configurations of the actions pod that might allow the pod to better utilize a form of ingestion queue?

I'm not able to find any data sources to answer your question. You might want to specify the current configuration settings you are using for the actions pod. If this doesn't help, please reach out to the DataHub support team for further assistance.
