Resolving Metabase Ingestion Warnings and Understanding DataHub's Lineage Support

Original Slack Thread

<@U06TM7M40RK> For ingestion from Metabase, I am receiving the following two warnings:

  1. Unrecognized Data Platform found. Data Platform was not found. Using platform name as is. Platform: redshift
  2. Cannot resolve Database Name. Cannot determine database name for platform. Platform: redshift

How should I adjust the ingestion recipe? The database in Metabase is named “My Redshift PoC”.

To address the warnings you’re encountering during Metabase ingestion related to the platform and database name, you can adjust your ingestion recipe by using the engine_platform_map and database_alias_map configurations. Here’s how you can do it:

  1. Unrecognized Data Platform Warning: This warning suggests that the platform name is not being recognized correctly. You can use the engine_platform_map to map the Metabase engine to the correct DataHub platform.

  2. Cannot Resolve Database Name Warning: This warning indicates that DataHub could not determine the database name for the platform. You can use the database_alias_map to supply the database name that should be used when constructing dataset URNs.

Here’s an example configuration for your ingestion recipe:

source:
  type: metabase
  config:
    connect_uri: "http://your-metabase-instance:3000"
    username: "your-username"
    password: "your-password"
    engine_platform_map:
      redshift: redshift
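    database_alias_map:
      redshift: "my_redshift_db"  # placeholder: use the actual Redshift database name

  • engine_platform_map: This maps the Metabase engine name to the DataHub platform name. In this case, it maps redshift to redshift.
  • database_alias_map: This maps the DataHub platform name to the database name used when constructing dataset URNs. The key is the platform (redshift here), not the Metabase display name “My Redshift PoC”. Replace the placeholder my_redshift_db with the actual name of the Redshift database.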

These configurations should help resolve the warnings by ensuring that the platform and database names are correctly recognized and mapped during ingestion.
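
To see why both settings matter, it can help to look at the dataset URN they feed into. The sketch below is not DataHub's implementation. It only imitates a two-step lookup (engine to platform, then platform to database name) and formats the result in DataHub's dataset URN layout. The function name resolve_dataset_urn, the table public.orders, and the database name my_redshift_db are hypothetical, and keying the alias map by platform name is an assumption about how the source behaves.

```python
from typing import Dict, Optional


def resolve_dataset_urn(
    engine: str,
    table: str,
    engine_platform_map: Dict[str, str],
    database_alias_map: Dict[str, str],
    detected_db: Optional[str] = None,
    env: str = "PROD",
) -> str:
    """Illustrative only: combine platform and database name into a dataset URN."""
    # Fall back to the raw engine name when no explicit mapping exists.
    platform = engine_platform_map.get(engine, engine)
    # Assumption: an alias keyed by platform overrides any detected database name.
    db = database_alias_map.get(platform, detected_db)
    # Without a database name, the URN carries only the schema and table.
    name = f"{db}.{table}" if db else table
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


urn = resolve_dataset_urn(
    engine="redshift",
    table="public.orders",
    engine_platform_map={"redshift": "redshift"},
    database_alias_map={"redshift": "my_redshift_db"},
)
print(urn)
# urn:li:dataset:(urn:li:dataPlatform:redshift,my_redshift_db.public.orders,PROD)
```

If the database name is missing or differs from the one used when Redshift itself is ingested, the resulting URN will not match the Redshift dataset, so lineage from Metabase charts would point at a different entity.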

Would you like more detailed steps or further assistance with any specific part of the process?

Thank you!

<@U06TM7M40RK> Does DataHub support column-level lineage between Metabase and its source?

DataHub does support column-level lineage, but its availability depends on the specific data source you are integrating with. According to the <https://github.com/datahub-project/datahub/blob/master/docs/lineage/sql_parsing.md|DataHub documentation>, column-level lineage is supported for sources like BigQuery, Snowflake, dbt, Looker, and PowerBI. However, for Metabase, column-level lineage is not supported out of the box, as mentioned in a Slack discussion.

If you are using Metabase with a supported source like BigQuery or Snowflake, you might be able to achieve column-level lineage by leveraging the capabilities of those sources. For unsupported sources, you can use the <https://github.com/datahub-project/datahub/blob/master/docs/lineage/sql_parsing.md|DataHub SQL Parser> to generate column-level lineage from SQL query logs with 97-99% accuracy.

Would you like more details on how to set up column-level lineage for a specific data source or any other aspect of DataHub’s lineage capabilities?
