Troubleshooting `powerbi-report-server` Source Configuration

Original Slack Thread

Hi Team,

I am getting the error below while ingesting the `powerbi-report-server` source.

```
platform_instance
  extra fields not permitted (type=value_error.extra)
```
Recipe:

```
source:
  type: powerbi-report-server
  config:
    platform_instance: POWERBI.OCE.DEV
    username: '${POWERBI_USERNAME}'
    password: '${POWERBI_PASSWORD}'
    workstation_name: corpdev
    host_port: '********.visa.com:443'
    server_alias: server_alias
    env: DEV
    report_virtual_directory_name: Reports
    report_server_virtual_directory_name: ReportServer
    extract_ownership: true
    ownership_type: TECHNICAL_OWNER
pipeline_name: "POWERBI DEV"
sink:
  type: datahub-rest
  config:
    server: '${DATAHUB_GMS_HOST}'
    token: '${DATAHUB_GMS_TOKEN}'
    max_threads: 1
```

@RunLLM Can you help with the questions in this thread?


The error `extra fields not permitted (type=value_error.extra)` means that the `powerbi-report-server` source configuration does not recognize the `platform_instance` field. In other words, this source type does not support the `platform_instance` parameter.
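To illustrate why the error appears: DataHub source configs are validated by pydantic models that reject unknown keys. The sketch below is a stdlib-only stand-in for that behavior, not DataHub's actual code. `ALLOWED_FIELDS` and `validate_config` are hypothetical names, and the allowed keys are simply the ones used in this thread's recipe (the real config may accept more).

```python
# Hypothetical stand-in for strict config validation; not DataHub's implementation.
# Assumption: the allowed keys are just the ones used in this thread's recipe.
ALLOWED_FIELDS = {
    "username",
    "password",
    "workstation_name",
    "host_port",
    "server_alias",
    "env",
    "report_virtual_directory_name",
    "report_server_virtual_directory_name",
    "extract_ownership",
    "ownership_type",
}


def validate_config(config: dict) -> dict:
    """Return the config unchanged, or raise if it contains unknown keys."""
    extra = sorted(set(config) - ALLOWED_FIELDS)
    if extra:
        details = "\n".join(f"{name}\n  extra fields not permitted" for name in extra)
        raise ValueError(details)
    return config


# Illustrative config containing the unsupported key from the thread.
recipe_config = {
    "platform_instance": "POWERBI.OCE.DEV",
    "host_port": "example.invalid:443",
    "env": "DEV",
}

try:
    validate_config(recipe_config)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Any key outside the declared set is reported by name, which matches the shape of the error in the original message: the offending field on one line, the reason on the next.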

Here are the steps to resolve this issue:

  1. Remove the `platform_instance` field: Since this source does not support `platform_instance`, remove it from your configuration.

  2. Verify the other configuration parameters: Ensure that all remaining parameters are set according to the DataHub PowerBI Report Server ingestion documentation.

Here is the updated ingestion recipe without the `platform_instance` field:

```
source:
  type: powerbi-report-server
  config:
    username: '${POWERBI_USERNAME}'
    password: '${POWERBI_PASSWORD}'
    workstation_name: corpdev
    host_port: '********.visa.com:443'
    server_alias: server_alias
    env: DEV
    report_virtual_directory_name: Reports
    report_server_virtual_directory_name: ReportServer
    extract_ownership: true
    ownership_type: TECHNICAL_OWNER
pipeline_name: "POWERBI DEV"
sink:
  type: datahub-rest
  config:
    server: '${DATAHUB_GMS_HOST}'
    token: '${DATAHUB_GMS_TOKEN}'
    max_threads: 1
```

Would you like more detailed steps or further assistance on any specific part of this configuration?

Sources:

  • DataHub PowerBI Report Server ingestion documentation

@RunLLM
How do I add a platform instance?

To add a platform instance in DataHub, the ingestion source you are using must support the `platform_instance` parameter. The `powerbi-report-server` source does not, which is why you encountered the error.

However, for sources that do support platform_instance, you can follow these general steps:

  1. Specify the `platform_instance` in the ingestion recipe: Add the `platform_instance` parameter to the source configuration in your ingestion recipe.

  2. Verify the configuration: Ensure that `platform_instance` is correctly specified and that the source supports this parameter.
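One way to check whether a source supports a parameter is to inspect its config class. The sketch below assumes the config class is a pydantic model, which exposes its declared fields as `model_fields` (pydantic v2) or `__fields__` (pydantic v1). `supports_field` and `FakeSourceConfig` are hypothetical names used only for illustration.

```python
def supports_field(config_cls: type, field_name: str) -> bool:
    """Check whether a pydantic-style config class declares `field_name`."""
    # pydantic v2 exposes `model_fields`; pydantic v1 exposes `__fields__`.
    for attr in ("model_fields", "__fields__"):
        fields = getattr(config_cls, attr, None)
        if isinstance(fields, dict):
            return field_name in fields
    return False


class FakeSourceConfig:
    """Stand-in for a real source config class; for illustration only."""

    __fields__ = {"host_port": None, "platform_instance": None}


print(supports_field(FakeSourceConfig, "platform_instance"))
print(supports_field(FakeSourceConfig, "workstation_name"))
```

With DataHub installed, the same helper could be pointed at the config class of the source in question; the module path of that class depends on the source and DataHub version, so it is left out here.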

Here is an example of how to configure a source that supports `platform_instance`, such as `mysql`:

```
source:
  type: mysql
  config:
    host_port: localhost:3306
    platform_instance: core_finance
    database: dbname
    username: root
    password: example
sink:
  type: datahub-rest
  config:
    server: 'http://localhost:8080'
```

For more detailed steps, you can refer to the Platform Instances documentation: https://datahubproject.io/docs/0.13.1/platform-instances
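For context on what `platform_instance` changes: the instance name becomes part of the generated URNs. The sketch below mirrors, as far as I understand it, the string layout produced by DataHub's `make_dataplatform_instance_urn` and `make_dataset_urn_with_platform_instance` helpers. The functions here are stdlib re-implementations for illustration only, and the table name `dbname.orders` is hypothetical.

```python
from typing import Optional


def data_platform_urn(platform: str) -> str:
    """URN of the platform itself, e.g. mysql."""
    return f"urn:li:dataPlatform:{platform}"


def platform_instance_urn(platform: str, instance: str) -> str:
    """URN of a named instance of a platform."""
    return f"urn:li:dataPlatformInstance:({data_platform_urn(platform)},{instance})"


def dataset_urn(
    platform: str, name: str, env: str = "PROD", instance: Optional[str] = None
) -> str:
    """Dataset URN; the instance, when given, is prefixed to the dataset name."""
    qualified_name = f"{instance}.{name}" if instance else name
    return f"urn:li:dataset:({data_platform_urn(platform)},{qualified_name},{env})"


print(platform_instance_urn("mysql", "core_finance"))
print(dataset_urn("mysql", "dbname.orders"))
print(dataset_urn("mysql", "dbname.orders", instance="core_finance"))
```

Because the instance is embedded in the URN, adding or changing `platform_instance` on an already-ingested source yields new URNs rather than updating the existing entities.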

If you need to add a platform instance to a container manually, you can use `MetadataChangeProposalWrapper` to associate the container with a platform instance. Here is an example in Python:

```python
from datahub.emitter.mce_builder import (
    make_data_platform_urn,
    make_dataplatform_instance_urn,
)
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.metadata.schema_classes import DataPlatformInstanceClass

# Placeholder URN: replace it with the URN of an existing container.
container_urn = "urn:li:container:(urn:li:dataPlatform:mysql,container_name,PROD)"
platform_name = "mysql"
platform_instance = "core_finance"

# Attach a dataPlatformInstance aspect to the container and wrap it as a work unit.
workunit = MetadataChangeProposalWrapper(
    entityUrn=container_urn,
    aspect=DataPlatformInstanceClass(
        platform=make_data_platform_urn(platform_name),
        instance=make_dataplatform_instance_urn(platform_name, platform_instance),
    ),
).as_workunit()
```

Would you like more detailed steps or further assistance on any specific part of this process?

Sources: