Issues with DataHub CLI Version 1.0.0.2 Upgrade

Original Slack Thread

Hi all,
I’ve been using CLI version 0.15.0.5 with dbt, and it has worked well. However, after upgrading to version 1.0.0.2, I’ve noticed a few significant issues:

  1. Missing Raw and Formatted SQL: Previously, I was able to view both the raw and formatted SQL. After the upgrade, I only see the view definition. I’ve attached a screenshot for reference.
  2. Column Lineage Not Working: This is the most critical issue—column lineage is no longer displaying correctly. The lineage information appears to be missing or inaccurately presented.
  3. This mainly happens between BigQuery and dbt; from BigQuery to BigQuery it works fine.
    Has anyone else experienced similar issues after upgrading? Would appreciate any input or potential workarounds.

Hey there! :wave:

<@U06TM7M40RK> Can you help with the questions in this thread?

:information_source: Please continue to tag RunLLM (e.g., <@U06TM7M40RK>) to have it continue generating responses!

To address the issues you’re experiencing after upgrading to DataHub CLI version 1.0.0.2, let’s break down the potential causes and solutions based on the available data sources:

  1. Missing Raw and Formatted SQL:

    • The upgrade to version 1.0.0.2 might have introduced changes in how SQL definitions are handled. According to the <https://github.com/datahub-project/datahub/blob/master/docs/how/updating-datahub.md|DataHub documentation>, there have been several breaking changes and updates in recent versions. It’s possible that the way SQL is displayed has been altered or requires additional configuration.
    • Check if there are any new configuration options related to SQL display in the updated version. You might need to adjust your ingestion recipe or settings to restore the previous behavior.
  2. Column Lineage Not Working:

Potential Workarounds:

  • Revert to Previous Version: If the issues are critical and you need an immediate solution, consider reverting to the previous version (0.15.0.5) where everything was working as expected.
  • Check Release Notes: Review the release notes for version 1.0.0.2 to identify any changes that might affect your setup. This can provide insights into new configurations or known issues.
  • Community Support: Engage with the DataHub community on Slack or GitHub to see if others have faced similar issues and found solutions.
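Before reverting, it can help to confirm which CLI version is actually installed. The following is a minimal sketch, assuming the CLI was installed from the `acryl-datahub` package on PyPI; the helper names (`parse_version`, `installed_cli_version`, `needs_rollback`) are hypothetical and only illustrate the comparison against the known-good version from this thread.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.0.0.2' into (1, 0, 0, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_cli_version(package="acryl-datahub"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_rollback(installed, known_good="0.15.0.5"):
    """True when the installed version is newer than the known-good one."""
    if installed is None:
        return False
    return parse_version(installed) > parse_version(known_good)


if __name__ == "__main__":
    current = installed_cli_version()
    print("installed:", current)
    if needs_rollback(current):
        # Pinning the older release mentioned in the thread.
        print("to revert: pip install 'acryl-datahub==0.15.0.5'")
```

If the check reports a newer version, pinning the package to 0.15.0.5 with pip in the same environment is one way to carry out the revert suggested above.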

Would you like more detailed steps on any of these points or additional troubleshooting tips?

<@U06TM7M40RK> but the default is TRUE for include_column_lineage…

Yes, you are correct that the default setting for include_column_lineage is true in DataHub’s dbt ingestion configuration. This means that column-level lineage should be enabled by default. However, if you are experiencing issues with column lineage not displaying correctly after the upgrade, there could be other factors at play. Here are some additional considerations and steps you can take:

  1. Verify Configuration: Double-check your ingestion configuration to ensure that include_column_lineage is explicitly set to true. Although it defaults to true, it’s good practice to confirm this setting in your configuration file.

  2. Check for Known Issues: There might be specific bugs or issues introduced in version 1.0.0.2 that affect column lineage. Reviewing the <https://github.com/datahub-project/datahub/issues|GitHub issues> for any reports related to column lineage problems in this version can provide insights.

  3. Review Logs: Examine the logs generated during the ingestion process for any warnings or errors related to column lineage. This can help identify if there are any underlying issues affecting the lineage extraction.

  4. Compatibility and Dependencies: Ensure that all dependencies and related components (e.g., dbt version, DataHub plugins) are compatible with the new CLI version. Incompatibilities can sometimes lead to unexpected behavior.

  5. Community Feedback: Engage with the DataHub community on Slack or GitHub to see if others have encountered similar issues and have found solutions or workarounds.
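The "Verify Configuration" step can be sketched as a small check on the recipe. This is only an illustration: the recipe is written as a Python dict rather than loaded from a YAML file, the paths are placeholders, the sink section is omitted, and `column_lineage_setting` is a hypothetical helper, not part of DataHub.

```python
# Hypothetical dbt source section of a recipe; paths are placeholders.
recipe = {
    "source": {
        "type": "dbt",
        "config": {
            "manifest_path": "target/manifest.json",  # placeholder path
            "catalog_path": "target/catalog.json",    # placeholder path
            "target_platform": "bigquery",
            "include_column_lineage": True,            # set explicitly
        },
    },
}


def column_lineage_setting(recipe):
    """Report whether include_column_lineage is explicit or left to its default."""
    config = recipe.get("source", {}).get("config", {})
    if "include_column_lineage" in config:
        return ("explicit", bool(config["include_column_lineage"]))
    # Per the thread, the flag defaults to true when it is not set.
    return ("default", True)


if __name__ == "__main__":
    print(column_lineage_setting(recipe))
```

A result of `("explicit", False)` would point at the recipe itself; any other result suggests the cause lies elsewhere, which is where the log and compatibility checks above come in.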

If you need more detailed troubleshooting steps or further assistance, please let me know!

Hi :slight_smile:! Did this interaction resolve your issue, or do you still require assistance?

Please tag @RunLLM if you still need further assistance!

Or tag @Datahub Community Support if you require a person to take a look :slight_smile:

Hope this helps!