How does DataHub handle changes during scheduled ingestion runs across multiple data sources?

Original Slack Thread

Hello, I have a question: when running ingestion on a schedule, does DataHub read the entire source, or does it only look at the changes made to the source since the last ingestion run? For example, if I connect to Snowflake and there are three tables (A, B, and C) the first time ingestion runs, and only C has changed by the next run, will DataHub re-read A, B, and C, or will it only look at C? Thanks!

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Are you using UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g. 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g. BigQuery)

I am using the UI for ingestion, v0.12.1, and we’re integrating Snowflake, Vertica, Postgres, Tableau, S3, and Airflow.