<@U06TM7M40RK> for ingestion from Metabase I am receiving the following two warnings:
Unrecognized Data Platform found. Data Platform was not found. Using platform name as is. Platform: redshift
Cannot resolve Database Name. Cannot determine database name for platform. Platform: redshift
How should I adjust the ingestion recipe? The database in Metabase is named “My Redshift PoC”.
Both warnings come from how the Metabase source resolves the platform and the database name, and both can be addressed in the ingestion recipe with the engine_platform_map and database_alias_map settings:
Unrecognized Data Platform Warning: The Metabase engine name (redshift) has no entry in the connector's engine-to-platform mapping, so it is used as is. Adding an explicit entry to engine_platform_map maps the Metabase engine to the correct DataHub platform.
Cannot Resolve Database Name Warning: The connector could not determine the database name for the platform. You can supply it explicitly with database_alias_map.
Here’s an example configuration for your ingestion recipe:
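A minimal sketch of such a recipe follows. The connection values, the sink address, and the Redshift database name dev are placeholders rather than values from this thread. One caveat: this thread describes database_alias_map as keyed by the Metabase database name, whereas the sketch keys it by platform, which is how I understand the connector to look it up. Treat that as an assumption and verify it against the Metabase source documentation for your DataHub version.

```yaml
# Sketch of a Metabase ingestion recipe; all connection values are placeholders.
source:
  type: metabase
  config:
    connect_uri: "http://localhost:3000"   # assumption: your Metabase URL
    username: "ingest@example.com"         # placeholder account
    password: "${METABASE_PASSWORD}"       # resolved from an environment variable

    # Map the Metabase engine name to the DataHub platform name.
    engine_platform_map:
      redshift: redshift

    # Assumption: keyed by platform name; the value is the actual Redshift
    # database name (hypothetical "dev"), not the Metabase display name
    # "My Redshift PoC".
    database_alias_map:
      redshift: dev

sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"        # assumption: local DataHub GMS
```

Saved as, say, recipe.yml, this could be run with datahub ingest -c recipe.yml. If the warnings persist, the keys in the two maps are the first thing to check.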
engine_platform_map: This maps the Metabase engine name to the DataHub platform name. In this case, it maps redshift to redshift.
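database_alias_map: This sets the database name DataHub uses when building dataset URNs. As far as I can tell, the connector looks it up by platform name (here redshift) rather than by the Metabase display name, so the value should be the actual Redshift database name; check the Metabase source documentation for your version.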
These configurations should help resolve the warnings by ensuring that the platform and database names are correctly recognized and mapped during ingestion.
Would you like more detailed steps or further assistance with any specific part of the process?
DataHub supports column-level lineage, but availability depends on the data source you are integrating with. According to the <https://github.com/datahub-project/datahub/blob/master/docs/lineage/sql_parsing.md|DataHub documentation>, column-level lineage is supported for sources like BigQuery, Snowflake, dbt, Looker, and PowerBI. For Metabase, however, column-level lineage is not supported out of the box, as mentioned in a Slack discussion.
If the database behind Metabase is a supported source like BigQuery or Snowflake, you may be able to get column-level lineage by relying on that source's own lineage capabilities. For unsupported sources, you can use the <https://github.com/datahub-project/datahub/blob/master/docs/lineage/sql_parsing.md|DataHub SQL Parser> to generate column-level lineage from SQL query logs; the linked documentation reports 97-99% accuracy.
Would you like more details on how to set up column-level lineage for a specific data source or any other aspect of DataHub’s lineage capabilities?