<@U06TM7M40RK> What is the difference between the extract_column_lineage and include_column_lineage configs in the BigQuery metadata ingestion recipe?
It seems likely that the extract_column_lineage and include_column_lineage configurations in the BigQuery metadata ingestion recipe serve different purposes related to lineage extraction.
- extract_column_lineage: This configuration is used to enable the generation of column-level lineage. It requires lineage_use_sql_parser to be enabled. This means that when this option is set to true, the system will attempt to generate lineage information at the column level, which involves understanding how data flows between columns across different tables or views.
- include_column_lineage: This configuration is not explicitly mentioned in the provided data sources for BigQuery. However, based on similar configurations in other contexts, it might be used to control whether the extracted column-level lineage should be included in the ingestion process. This could mean that even if column-level lineage is extracted, it might not be included in the final metadata unless this option is enabled.
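As a rough illustration of the dependency described above, here is a minimal Python sketch that represents the relevant part of a recipe as a dict and checks that both flags are set. The helper name column_lineage_problems is hypothetical and is not part of DataHub. The sketch only trusts values that are set explicitly, because defaults may differ between versions. In practice the same keys would normally live in a YAML recipe file.

```python
# Hypothetical, trimmed-down recipe. Only the two flags discussed in this
# thread are shown; a real recipe needs credentials, a sink, and so on.
recipe = {
    "source": {
        "type": "bigquery",
        "config": {
            "lineage_use_sql_parser": True,
            "extract_column_lineage": True,
        },
    },
}


def column_lineage_problems(recipe: dict) -> list[str]:
    """Return reasons why column-level lineage would likely not be generated.

    Assumption: a flag that is missing from the recipe is treated as
    "not enabled", since this sketch does not know the version's defaults.
    """
    config = recipe.get("source", {}).get("config", {})
    problems = []
    if not config.get("extract_column_lineage", False):
        problems.append("extract_column_lineage is not explicitly enabled")
    if not config.get("lineage_use_sql_parser", False):
        problems.append("lineage_use_sql_parser is not explicitly enabled")
    return problems


print(column_lineage_problems(recipe))
```

An empty list means both flags are explicitly enabled. It does not prove that lineage was actually extracted.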
For more detailed information, you might want to refer to the <BigQuery | DataHub documentation for BigQuery ingestion> and the <About DataHub Lineage | DataHub feature guide>.
Would you like more detailed information on how these configurations are typically used in practice?
Sources:
- <BigQuery | DataHub documentation for BigQuery ingestion>
- <About DataHub Lineage | DataHub feature guide>
Hint: Mention @runllm in the thread for followups.
<@U06TM7M40RK> I am unable to see column lineage in the DataHub console UI even though extract_column_lineage is set to true. Why is that so?
If you are unable to see column lineage in the DataHub console UI despite having extract_column_lineage set to true, there could be several reasons for this issue. Here are some steps and considerations to help troubleshoot and resolve the problem:
- Verify Configuration: Ensure that all necessary configurations for column-level lineage are correctly set in your ingestion recipe. This includes enabling extract_column_lineage and any other related settings such as lineage_use_sql_parser.
- Check Ingestion Logs: Review the logs from your ingestion process to ensure that column-level lineage data is being captured and sent to DataHub. Look for any errors or warnings related to lineage extraction.
- DataHub UI Configuration: Make sure that the DataHub UI is configured to display column-level lineage. In the UI, navigate to the dataset’s “Lineage” tab and enable the “Show Columns” option to visualize column-level lineage.
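- Verify Metadata Files: If the lineage comes from a file-based source such as dbt, manually inspect its metadata files (e.g., manifest.json, catalog.json) to ensure they contain the metadata you expect. These files are dbt artifacts; a BigQuery-only recipe does not produce them.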
- Check for Known Issues: Refer to the <https://github.com/datahub-project/datahub/issues|DataHub GitHub issues> for any known issues or updates related to column-level lineage.
- Version Compatibility: Ensure that you are using compatible versions of DataHub and the ingestion framework. Sometimes, version mismatches can cause issues with specific features like column-level lineage.
- Manual Lineage Editing: If automatic extraction is not working, consider manually adding or editing lineage edges in the DataHub UI.
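One way to narrow down whether the problem is in ingestion or in the UI is to look at the lineage metadata stored for the affected dataset. The Python sketch below assumes you have already obtained the dataset's upstreamLineage aspect as a dict (for example from an ingestion output file) and that column-level entries appear under a fineGrainedLineages key. The helper name has_column_lineage and all URNs are hypothetical placeholders.

```python
def has_column_lineage(upstream_lineage: dict) -> bool:
    """Return True if the aspect carries at least one column-level edge.

    Assumption: column-level edges are listed under "fineGrainedLineages",
    each with non-empty "upstreams" and "downstreams" lists.
    """
    entries = upstream_lineage.get("fineGrainedLineages") or []
    return any(e.get("upstreams") and e.get("downstreams") for e in entries)


# Placeholder URNs for illustration only.
SRC = "urn:li:dataset:(urn:li:dataPlatform:bigquery,proj.ds.source,PROD)"
DST = "urn:li:dataset:(urn:li:dataPlatform:bigquery,proj.ds.target,PROD)"

# Table-level lineage only: nothing for the UI to show at column level.
table_only = {"upstreams": [{"dataset": SRC, "type": "TRANSFORMED"}]}

# Table-level lineage plus one column-level edge.
with_columns = {
    "upstreams": [{"dataset": SRC, "type": "TRANSFORMED"}],
    "fineGrainedLineages": [
        {
            "upstreamType": "FIELD_SET",
            "upstreams": [f"urn:li:schemaField:({SRC},user_id)"],
            "downstreamType": "FIELD",
            "downstreams": [f"urn:li:schemaField:({DST},user_id)"],
        }
    ],
}

print(has_column_lineage(table_only))
print(has_column_lineage(with_columns))
```

If the stored aspect has no column-level entries, the issue is more likely in extraction (configuration, SQL parsing, or version) than in the UI display settings.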
If you have verified these steps and still do not see column-level lineage, it might be a specific issue with your setup or a bug in the version you are using. Would you like more detailed steps or further assistance with specific configurations?
Sources:
- https://datahubspace.slack.com/archives/C029A3M079U/p1726506291.721859
- https://datahubproject.io/docs/0.14.0/api/tutorials/lineage#add-column-level-lineage