Investigating metadata ingestion issues and data loss in Datahub 0.10.5.5 with potential solutions

user-2 · March 4, 2024, 5:01pm

Hello! I’m having some issues with metadata ingestion. Overnight, a large chunk (almost all) of our ingested data was erased and I can’t figure out why. Any ideas?? Thanks!

user-2 · March 4, 2024, 5:01pm

I am currently using datahub version 0.10.5.5

user-1 · March 4, 2024, 5:01pm

Hi!
Speaking as a DataHub user, I can think of one possible cause.
Have you enabled stateful_ingestion in your connector recipe? If so, stateful_ingestion.remove_stale_metadata is enabled by default. That option soft-deletes entities that existed in the last successful run but are missing from the current one.
Another problem could be related to some issue with your persistence tool.
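
To make the stateful ingestion setting above concrete, here is a minimal sketch of a recipe with stale-entity removal turned off, written as a Python dict that mirrors the usual YAML recipe layout. The pipeline name, the mysql source type, the host and the server URL are placeholders I made up for illustration, not values from this thread. Adjust them to your own connector.

```python
import json


def build_recipe(remove_stale_metadata: bool = False) -> dict:
    """Build an ingestion recipe as a plain dict (same shape as the YAML recipe).

    All concrete values below are hypothetical placeholders.
    """
    return {
        # Stateful ingestion needs a stable pipeline_name to track state between runs.
        "pipeline_name": "my_mysql_pipeline",  # hypothetical name
        "source": {
            "type": "mysql",  # assumption: swap in your actual connector
            "config": {
                "host_port": "localhost:3306",  # hypothetical host
                "stateful_ingestion": {
                    "enabled": True,
                    # When True, entities missing from the current run
                    # are soft-deleted; False leaves them in place.
                    "remove_stale_metadata": remove_stale_metadata,
                },
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},  # hypothetical URL
        },
    }


if __name__ == "__main__":
    # Print the recipe so it can be compared against the YAML file in use.
    print(json.dumps(build_recipe(), indent=2))
```

If your existing recipe has stateful_ingestion enabled without remove_stale_metadata set explicitly, the default applies, so a run that returns fewer entities than the previous one could soft-delete the rest.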

user-2 · March 4, 2024, 5:01pm

The problem is I wasn’t ingesting anything. Is there any way for me to check what’s wrong with the persistence tool?

user-2 · March 4, 2024, 5:01pm

Also ingestion seems to be taking incredibly long relative to before all the data disappeared

user-1 · March 4, 2024, 5:01pm

If you are using AWS services (RDS, OpenSearch and MSK), you can check the CloudWatch logs and the metrics for each service, and see whether any of them recently went through automatic maintenance or an update.

user-2 · March 4, 2024, 5:01pm

I’m using the docker compose files

user-2 · March 4, 2024, 5:01pm

Upon checking the docker logs for the mysql container, I get many bad handshake errors. Is there any solution to this?
