Investigating metadata ingestion issues and data loss in Datahub with potential solutions

Original Slack Thread

Hello! I’m having some issues with metadata ingestion. Overnight, a large chunk (almost all) of our ingested data was erased and I can’t figure out why. Any ideas?? Thanks!

I am currently using datahub version

Speaking from my experience as a DataHub user, I can think of a possible reason for your issue.
Have you enabled the `stateful_ingestion` option inside your connector recipe? If so, the option `stateful_ingestion.remove_stale_metadata` is enabled by default. This option soft-deletes entities that existed in the last successful run but are missing from the current execution.
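If stale-metadata removal is the culprit, it can be disabled explicitly in the recipe while keeping stateful ingestion on. A minimal sketch, assuming a MySQL source — the source type, host, and server URL are placeholders for your actual setup:

```yaml
source:
  type: mysql                    # placeholder; use your actual connector type
  config:
    host_port: "localhost:3306"  # placeholder connection details
    stateful_ingestion:
      enabled: true
      # Keep state between runs, but do NOT soft-delete entities
      # that are missing from the current run.
      remove_stale_metadata: false

sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"  # placeholder GMS endpoint
```

Note that with `remove_stale_metadata: true` (the default), a run that sees an empty or partial source — for example, a run against a database that is temporarily unreachable — can soft-delete nearly everything ingested previously, which matches the symptom described above.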
Another possibility is an issue with your persistence layer (the database and index backing DataHub).

The problem is I wasn’t ingesting anything. Is there any way for me to check what’s wrong with the persistence tool?

Also, ingestion now seems to take incredibly long compared to before the data disappeared.

If you are using AWS services (RDS, OpenSearch and MSK), you can check the CloudWatch logs and metrics for each service, and whether any automatic maintenance or updates ran recently.

I’m using the docker compose files

Upon checking the docker logs for the mysql container, I get many bad handshake errors. Is there any solution to this?
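For context on what “Bad handshake” means: on connect, a MySQL server first sends an initial handshake packet (protocol version 10 plus a NUL-terminated server version string), and the error is logged when a client replies with something the server cannot parse — often an outdated client, or a non-MySQL client (a health checker or a misconfigured service) hitting port 3306. A minimal sketch of what a healthy server greeting looks like on the wire; the packet layout follows the MySQL client/server protocol, and the sample bytes in the usage note are fabricated for illustration:

```python
def parse_mysql_greeting(data: bytes) -> dict:
    """Parse the start of a MySQL initial handshake packet.

    Wire layout: 3-byte little-endian payload length, 1-byte sequence id,
    then the payload, which begins with a 1-byte protocol version
    (10 for modern servers) and a NUL-terminated server version string.
    """
    if len(data) < 5:
        raise ValueError("too short to be a MySQL greeting")
    payload_length = int.from_bytes(data[0:3], "little")
    protocol = data[4]
    if protocol != 10:
        # Anything else suggests the peer is not speaking the MySQL
        # protocol -- the kind of traffic that produces "Bad handshake".
        raise ValueError(f"unexpected protocol version {protocol}")
    end = data.index(b"\x00", 5)
    server_version = data[5:end].decode()
    return {"payload_length": payload_length, "server_version": server_version}
```

For example, a greeting from a hypothetical 8.0.34 server would parse as `parse_mysql_greeting((8).to_bytes(3, "little") + b"\x00" + b"\x0a8.0.34\x00")` → `{"payload_length": 8, "server_version": "8.0.34"}`. In the thread’s case, repeated “Bad handshake” entries often mean something on the compose network is opening connections to the MySQL port without completing this exchange, so checking which containers connect to port 3306 would be a reasonable next step.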