Hello! I’m having some issues with metadata ingestion. Overnight, a large chunk (almost all) of our ingested data was erased and I can’t figure out why. Any ideas? Thanks!
I am currently using DataHub version 0.10.5.5.
Hi!
Speaking from my perspective as a DataHub user, I can think of a possible reason for your issue.
Have you enabled the stateful_ingestion option inside your connector recipe? If so, the option stateful_ingestion.remove_stale_metadata is enabled by default. It soft-deletes any entities that existed in the last successful run but are missing from the current one.
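If it helps, here is a minimal programmatic sketch of that setting (assuming the DataHub Python SDK; the MySQL source, connection details, and pipeline name are placeholders, and the same stateful_ingestion block goes under source.config in a YAML recipe):

```python
from datahub.ingestion.run.pipeline import Pipeline

# Placeholder recipe: swap in your own source type, credentials, and names.
pipeline = Pipeline.create(
    {
        # Stateful ingestion needs a stable pipeline_name across runs.
        "pipeline_name": "my_mysql_ingestion",
        "source": {
            "type": "mysql",
            "config": {
                "host_port": "localhost:3306",
                "username": "datahub",
                "password": "datahub",
                "stateful_ingestion": {
                    "enabled": True,
                    # Defaults to True; set False so entities missing from a
                    # run are NOT soft-deleted.
                    "remove_stale_metadata": False,
                },
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }
)
pipeline.run()
pipeline.raise_from_status()
```

With remove_stale_metadata left at its default, a run that completes while seeing little or no data could soft-delete nearly everything from the previous run.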
Alternatively, the problem could be related to an issue with your persistence tool.
The problem is I wasn’t ingesting anything. Is there any way for me to check what’s wrong with the persistence tool?
Also, ingestion seems to be taking incredibly long compared to before all the data disappeared.
If you are using AWS services (RDS, OpenSearch, and MSK), you can check the CloudWatch logs and metrics for each service, and whether they recently went through any automatic maintenance or updates.
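For example, a quick sketch of that check with boto3 (the instance identifier and log group name are placeholders; the error-log group only exists if you export RDS logs to CloudWatch):

```python
import time

import boto3

# Placeholders: replace with your RDS instance ID and exported log group.
DB_INSTANCE_ID = "my-datahub-db"
LOG_GROUP = "/aws/rds/instance/my-datahub-db/error"

# RDS events (including automatic maintenance) from the last 24 hours.
rds = boto3.client("rds")
events = rds.describe_events(
    SourceType="db-instance",
    SourceIdentifier=DB_INSTANCE_ID,
    Duration=24 * 60,  # minutes
)
for event in events["Events"]:
    print(event["Date"], event["Message"])

# Error-log entries from the same window via CloudWatch Logs.
logs = boto3.client("logs")
start_ms = int((time.time() - 24 * 3600) * 1000)
for log_event in logs.filter_log_events(
    logGroupName=LOG_GROUP, startTime=start_ms
)["events"]:
    print(log_event["message"])
```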
I’m using the Docker Compose files.
Upon checking the docker logs for the mysql container, I get many bad handshake errors. Is there any solution to this?