Explanation of Stateful Ingestion and its Impact on Catalog Maintenance

Original Slack Thread

Can someone help explain what Stateful Ingestion does when it comes to ingesting metadata? Thank you very much.

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Are you using UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g. 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g. BigQuery)

It stores state that records which pipeline (recipe) ingested each asset into the catalog. If an asset is not ingested or updated again in the next run of the same pipeline, it is soft-deleted from the catalog.
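
To make the idea concrete, here is a minimal toy sketch of that behavior in plain Python. It is not DataHub's actual implementation or API: the `Catalog` class, the `run_pipeline` function, and the example URNs and pipeline names are all hypothetical, and the sketch assumes an asset that shows up again in a later run is simply marked active again.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical stand-in for the catalog (not a DataHub class)."""

    # Asset URN -> True if the asset is currently soft-deleted.
    removed: dict = field(default_factory=dict)
    # Pipeline name -> set of URNs seen in that pipeline's previous run.
    checkpoints: dict = field(default_factory=dict)


def run_pipeline(catalog, pipeline_name, seen_urns):
    """Simulate one ingestion run and return the URNs that were soft-deleted."""
    current = set(seen_urns)
    previous = catalog.checkpoints.get(pipeline_name, set())

    # Everything seen in this run is (re)marked as active (assumption of this sketch).
    for urn in current:
        catalog.removed[urn] = False

    # Assets seen in the previous run of the SAME pipeline but missing now are stale.
    stale = previous - current
    for urn in stale:
        catalog.removed[urn] = True  # soft delete: flagged, not erased

    # Save this run's state so the next run can be compared against it.
    catalog.checkpoints[pipeline_name] = current
    return stale


catalog = Catalog()
run_pipeline(catalog, "bq_prod", ["urn:table_a", "urn:table_b"])
run_pipeline(catalog, "other_pipeline", ["urn:table_c"])

# table_b is missing from the second run of bq_prod, so it is soft-deleted.
stale = run_pipeline(catalog, "bq_prod", ["urn:table_a"])
print(stale)            # {'urn:table_b'}
print(catalog.removed)  # table_c is untouched: it belongs to another pipeline
```

The state is keyed by pipeline name, so a run only affects assets that the same pipeline ingested previously. Assets brought in by a different recipe are left alone.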