<@U06TM7M40RK> what does the `extract_lineage_from_catalog` configuration in GBQ do? Is it redundant if `lineage_use_sql_parser: true` and `extract_column_lineage: true` are set?
The `extract_lineage_from_catalog` configuration in Google BigQuery (GBQ) enables the extraction of data lineage from the Data Lineage API exposed by Google Data Catalog. It pulls lineage information directly from the catalog, which is useful if you want to leverage Google's built-in lineage capabilities.
According to the <https://datahubproject.io/docs/0.14.0/generated/ingestion/sources/bigquery#config-details|DataHub documentation>, this extractor cannot build view lineage, so enabling view DDL parsing alongside it is recommended for comprehensive lineage extraction.
On the other hand, the `lineage_use_sql_parser` configuration uses a SQL parser to resolve view/table lineage, and `extract_column_lineage` generates column-level lineage, provided that `lineage_use_sql_parser` is enabled.
These configurations are not necessarily redundant; they are complementary. `extract_lineage_from_catalog` relies on the Data Lineage API exposed by Google Data Catalog, which might provide different or additional lineage information compared to what can be parsed from SQL directly. Using both methods can give a more complete lineage picture, because each can cover limitations of the other (for example, the catalog extractor cannot build view lineage, while the SQL parser can resolve it).
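As a rough illustration of enabling both approaches together, a BigQuery recipe might look like the sketch below. The three lineage flags are the ones discussed in this thread; the project ID is a hypothetical placeholder, and the sink section is left out. Treat this as a starting point and check the config details page for your DataHub version rather than taking it as a verified setup.

```yaml
source:
  type: bigquery
  config:
    project_ids:
      - my-gcp-project              # hypothetical placeholder, replace with your own
    # Pull lineage from the Data Lineage API exposed by Google Data Catalog.
    # Per the docs, this extractor cannot build view lineage on its own.
    extract_lineage_from_catalog: true
    # Resolve view/table lineage by parsing SQL.
    lineage_use_sql_parser: true
    # Column-level lineage; requires lineage_use_sql_parser to be enabled.
    extract_column_lineage: true
```

With all three flags set, the catalog extractor and the SQL parser each contribute lineage, and column-level lineage comes from the SQL parser path.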
Would you like more detailed information on how these configurations interact or any specific use cases?
Sources:
- <https://datahubproject.io/docs/0.14.0/generated/ingestion/sources/bigquery#config-details|DataHub BigQuery Config Details>
Hint: Mention @runllm in the thread for followups.
Hi <@U07HF0U8716>, how did my answer work out for you? Need any more assistance? You can also tag <@U07FV3UHE4D> to let your fellow community members know you’re still looking for guidance.
Hi! Did this interaction resolve your issue, or do you still require assistance?
Please tag @RunLLM if you still need further assistance!
Or tag @Datahub Community Support if you require a person to take a look.
Hope this helps! ><