Ingesting dbt Cloud with Snowflake: Metadata Order Considerations

Original Slack Thread

<@U06TM7M40RK> When ingesting dbt Cloud with Snowflake as the target platform, is it generally advised to first ingest the Snowflake metadata, or dbt Cloud? Does it even matter?

You should ingest both sources, and the order does not matter. According to the <https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/docs/sources/dbt/README.md|dbt ingestion documentation>, you must run ingestion for both dbt and your data warehouse (the target platform), and the two runs can happen in any order. This works because the ingestion process generates lineage between dbt nodes and the underlying target platform nodes, such as Snowflake tables or views.

That said, Snowflake ingestion can provide higher-quality column-level lineage than dbt ingestion, because dbt relies on SQL parsing to reach the same level of detail. Ingesting Snowflake metadata is therefore worthwhile if you need detailed lineage information.
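As a rough illustration of the setup described above, here is a minimal Python sketch that builds one recipe per source and runs them in whichever order you pass. It assumes the `acryl-datahub` package with the dbt-cloud and Snowflake plugins installed. The helper names `build_recipes` and `run_all`, the environment variable names, the server URL, and all IDs are placeholders for illustration, not values from the thread. Check the field names against the source documentation for your DataHub version.

```python
import os


def build_recipes(server="http://localhost:8080"):
    """Build one recipe per source. All values below are placeholders."""
    # Both recipes write to the same DataHub instance (assumed local URL).
    sink = {"type": "datahub-rest", "config": {"server": server}}
    dbt_cloud = {
        "source": {
            "type": "dbt-cloud",
            "config": {
                "token": os.environ.get("DBT_CLOUD_TOKEN", ""),
                "account_id": 0,  # placeholder dbt Cloud account ID
                "project_id": 0,  # placeholder dbt Cloud project ID
                "job_id": 0,  # placeholder dbt Cloud job ID
                # Links dbt nodes to the matching Snowflake entities.
                "target_platform": "snowflake",
            },
        },
        "sink": sink,
    }
    snowflake = {
        "source": {
            "type": "snowflake",
            "config": {
                "account_id": os.environ.get("SNOWFLAKE_ACCOUNT", ""),
                "warehouse": os.environ.get("SNOWFLAKE_WAREHOUSE", ""),
                "username": os.environ.get("SNOWFLAKE_USER", ""),
                "password": os.environ.get("SNOWFLAKE_PASSWORD", ""),
            },
        },
        "sink": sink,
    }
    return {"dbt-cloud": dbt_cloud, "snowflake": snowflake}


def run_all(recipes, order):
    """Run each recipe in the given order; either order should work."""
    # Imported here so the recipes can be built without DataHub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    for name in order:
        pipeline = Pipeline.create(recipes[name])
        pipeline.run()
        pipeline.raise_from_status()


if __name__ == "__main__":
    run_all(build_recipes(), order=["snowflake", "dbt-cloud"])
```

Swapping the list to `["dbt-cloud", "snowflake"]` runs the same two ingestions in the opposite order.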

Would you like more detailed information on any specific aspect of this setup?
