Ingesting Power BI Metadata: Lineage Between Page and Report Issue

Original Slack Thread

When ingesting Power BI metadata, the lineage between a Power BI page and its Power BI report does not show up. Lineage between the dataset and the page does work, though.

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Are you using UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g. 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g. BigQuery)

<@U06TM7M40RK> Can you help with the questions in this thread?


:information_source: We’re experimenting with a new AI assistant for questions in the <#CUMUWQU66> channel to help you track down helpful resources more quickly. Keep in mind that responses may not always be spot-on; feel free to ask it follow-up questions and don’t forget to provide its responses a :thumbsup: or :thumbsdown: so other folks in the Community know if you’re still looking for help on the topic!

To assist you better, could you please provide the following details?

  1. Are you using the UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g., 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g., BigQuery)

From your description, lineage between Power BI pages and reports is not being detected. Here are some relevant points and possible solutions:

Lineage Between Power BI Pages and Reports
According to the <https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/docs/sources/powerbi/powerbi_pre.md|PowerBI ingestion documentation>, setting extract_lineage to true extracts table lineage for the tables present in PowerBI Datasets. Because that option concerns dataset tables, it may not change how lineage is represented between other Power BI components, such as pages and reports.

Configuration Example
Here is an example of how your PowerBI ingestion recipe might look:

source:
  type: powerbi
  config:
    # Credentials of the Azure AD application used to call the Power BI APIs
    tenant_id: 'your-tenant-id'
    client_id: 'your-client-id'
    client_secret: 'your-client-secret'
    extract_lineage: true  # table lineage for tables in PowerBI Datasets
    extract_column_level_lineage: true
    extract_ownership: true
    platform_instance: 'your-platform-instance'
    env: 'PROD'
sink:
  type: datahub-rest
  config:
    server: 'http://localhost:8080'  # DataHub GMS endpoint
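
For anyone who prefers to run the recipe above from Python instead of the CLI, here is a minimal sketch. It assumes the acryl-datahub package is installed with its powerbi extra. build_recipe and run_recipe are hypothetical helper names introduced only for this example, and the placeholder values are the same ones used in the recipe.

```python
def build_recipe(tenant_id, client_id, client_secret,
                 server="http://localhost:8080"):
    """Return the recipe from this thread as a plain dict (hypothetical helper)."""
    return {
        "source": {
            "type": "powerbi",
            "config": {
                "tenant_id": tenant_id,
                "client_id": client_id,
                "client_secret": client_secret,
                "extract_lineage": True,
                "extract_column_level_lineage": True,
                "extract_ownership": True,
                "platform_instance": "your-platform-instance",
                "env": "PROD",
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": server}},
    }


def run_recipe(recipe):
    """Run the recipe in-process; requires acryl-datahub to be installed."""
    # Imported lazily so build_recipe stays usable without DataHub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    # Raise an exception if the source or sink reported failures.
    pipeline.raise_from_status()
```

Calling run_recipe(build_recipe(tenant, client, secret)) with real credentials should behave like running the YAML recipe through the CLI, which makes it easier to rerun the ingestion while toggling one option at a time.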

Known Issues and Feature Requests

  1. Lineage Representation: There are known issues with how lineage is represented in DataHub for Power BI. For example, lineage between different workspaces or between pages and reports might not be fully supported. According to a <Slack conversation>, the current ingestion process might only show the pages used in the report rather than the full lineage.

  2. Feature Requests: If the current capabilities do not meet your needs, you can raise a feature request on GitHub. According to a <Slack conversation>, raising a feature request is the recommended way to get new capabilities added to DataHub.
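
Before raising an issue, it can help to check what the ingestion actually emitted for the report. The sketch below rests on assumptions that are not confirmed in this thread: that the connector emits a Power BI report as a dashboard entity and its pages as chart entities, and that the links between them are stored in the report's dashboardInfo aspect. fetch_linked_chart_urns and missing_pages are hypothetical helper names, and the DataHub client classes come from the acryl-datahub package.

```python
def missing_pages(linked_chart_urns, expected_page_urns):
    """Return the expected page URNs that the report does not link to."""
    return sorted(set(expected_page_urns) - set(linked_chart_urns))


def fetch_linked_chart_urns(server, report_urn):
    """List the chart URNs attached to a report's dashboardInfo aspect.

    Assumes the report is stored as a dashboard entity; requires acryl-datahub.
    """
    # Lazy imports keep missing_pages usable without DataHub installed.
    from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig
    from datahub.metadata.schema_classes import DashboardInfoClass

    graph = DataHubGraph(DatahubClientConfig(server=server))
    info = graph.get_aspect(entity_urn=report_urn, aspect_type=DashboardInfoClass)
    if info is None:
        return []
    # Read both the legacy `charts` list and the newer `chartEdges` list;
    # getattr guards against client versions that lack `chartEdges`.
    urns = set(info.charts or [])
    for edge in getattr(info, "chartEdges", None) or []:
        urns.add(edge.destinationUrn)
    return sorted(urns)
```

If missing_pages reports page URNs that you can see in the Power BI report, that output together with the recipe is a concrete example worth attaching to a GitHub issue or feature request.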

Would you like more detailed steps or further assistance on any of these points?

Could you please share some concrete examples and a bit more detail?