SQL Server Ingestion Behavior Showing Datasets Twice in DataHub Database Level

Original Slack Thread

Hi everyone,
Every time I use the SQL Server ingestion through the UI (version …), I get strange behaviour in Datasets at the database level: each database appears twice for no reason. Screenshot below…

The screenshot highlights the lusiadas database, but it’s also happening with the other one…
Any idea what might be the reason and how to avoid that?

This may be a race condition that happens the first time you ingest. Can you try rerunning ingestion and seeing if the issue persists?

Ran it a second time and the “lusiadas” database now appears only once, but it did not work for “armada”.

The thing is, it seems that every time I run ingestion and it picks up new content, it creates a second database with the same name. If I then run it again with no new content, the duplicate goes away, but not for all databases. Weird…
Any idea how to avoid this behaviour?

This should be fixed by https://github.com/datahub-project/datahub/pull/9227 once it’s merged and released

Thanks for letting me know Andrew :+1: