Troubleshooting Kafka Topic Ingestion Error and Configuration Example

Original Slack Thread

Hello, I’m struggling with the ingestion of Kafka topics. I get the following error message:

datahub.ingestion.source.kafka:576} - Failed to fetch config details due to error Expected non-empty list of ConfigResource objects.

And I’m using the following ingestion script:

source:
    type: kafka
    config:
        platform_instance: dev_kafka
        env: DEV
        connection:
            consumer_config:
                security.protocol: SASL_SSL
                sasl.mechanism: PLAIN
                sasl.username: ${USER_KAFKA_DEV}
                sasl.password: '${PW_KAFKA_DEV}'
                ssl.ca.location: "catrust.pem"
            bootstrap: 'broker1.broker1.my.broker:9093'
            schema_registry_url: 'https://broker1.my.broker:8081'
            schema_registry_config:
                basic.auth.user.info: '${USER_KAFKA_DEV}:${PW_KAFKA_DEV}'
        enable_meta_mapping: true

        #schema
        topic_patterns:
            #allow: ['.*']
            allow: ['myTopic']
            deny: ['_confluent.*']

# Recipe sink configuration.
sink:
    type: "datahub-rest"
    config:
        server: "http://localhost:8080"

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Are you using UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g. 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g. BigQuery)

To help you more effectively with the Kafka ingestion issue, could you please provide the following additional information?

  1. Are you using the UI or CLI for ingestion?
  2. Which DataHub version are you using (e.g., 0.12.0)?
  3. What data source(s) are you integrating with DataHub (e.g., BigQuery)?

Based on the error message "Failed to fetch config details due to error Expected non-empty list of ConfigResource objects", the config lookup appears to have been called with an empty list of resources, so the problem may lie in the configuration of your Kafka source. One possibility is that no topics were discovered or that none matched topic_patterns; another is that the Kafka client could not retrieve the expected details from the Kafka broker.

Here are some steps to troubleshoot and resolve this issue:

  1. Verify Kafka Configuration:
    Ensure that your Kafka configuration is correctly set up. This includes the bootstrap servers and any necessary security configurations.

  2. Check Network Connectivity:
    Confirm that the DataHub instance can reach the Kafka brokers. You can use tools like telnet or nc to verify connectivity.

  3. Inspect Kafka Broker Logs:
    Check the logs of your Kafka broker for any errors or warnings that might indicate why the configuration details are not being fetched.

  4. Review DataHub Logs:
    Look at the detailed logs from DataHub to get more context on the error. This can provide insights into what might be going wrong.

  5. Configuration Example:
    Here is an example configuration for Kafka ingestion in DataHub:

    source:
      type: "kafka"
      config:
        connection:
          bootstrap: "broker1:9092,broker2:9092"
          consumer_config:
            security.protocol: "PLAINTEXT"
          # schema_registry_url sits under connection, next to bootstrap
          schema_registry_url: "http://localhost:8081"
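
The connectivity check from step 2 can also be sketched in Python, for hosts where telnet or nc is not installed. This is a minimal sketch, not part of DataHub: `can_reach` is a hypothetical helper name, and it only confirms that a TCP connection can be opened. It does not verify TLS, SASL credentials, or topic permissions.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the hostname and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Run it from the machine that executes the ingestion, for example `can_reach("broker1.broker1.my.broker", 9093)` for the broker and `can_reach("broker1.my.broker", 8081)` for the schema registry, using the hostnames and ports from your own recipe. A `False` result points to a network, DNS, or firewall problem rather than to the recipe itself.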
    

Would you like more detailed step-by-step guidance on any of these points?

Hi,

  1. Are you using the UI or CLI for ingestion? --> CLI
  2. Which DataHub version are you using? (e.g., 0.12.0)? -->
    DataHub CLI version: 0.13.3.3
    Models: bundled
    Python version: 3.10.13 | packaged by conda-forge | (main, Oct 26 2023, 18:07:37) [GCC 12.3.0]
  3. What data source(s) are you integrating with DataHub? --> Kafka Confluent

Could you provide a working ingestion script for Kafka Confluent as an example, including topic_patterns examples? Thanks
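
On the topic_patterns part of the question, it may help to see how allow and deny lists of regular expressions typically select topic names. The sketch below is an illustration, not DataHub code: `filter_topics` is a hypothetical helper, the topic names are made up, and it assumes that patterns are matched from the start of the name, case-insensitively, with deny taking precedence over allow.

```python
import re
from typing import Iterable, List


def filter_topics(topics: Iterable[str], allow: List[str], deny: List[str]) -> List[str]:
    """Keep topics that match at least one allow pattern and no deny pattern."""
    kept = []
    for topic in topics:
        # Assumption: deny wins over allow.
        if any(re.match(p, topic, re.IGNORECASE) for p in deny):
            continue
        # Assumption: re.match anchors at the start of the name only.
        if any(re.match(p, topic, re.IGNORECASE) for p in allow):
            kept.append(topic)
    return kept


# Hypothetical topic names, for illustration only.
topics = ["myTopic", "myTopic.retry", "orders", "_confluent-metrics"]

exact = filter_topics(topics, allow=["myTopic$"], deny=["_confluent.*"])
prefix = filter_topics(topics, allow=["myTopic"], deny=["_confluent.*"])
everything = filter_topics(topics, allow=[".*"], deny=["_confluent.*"])
nothing = filter_topics(topics, allow=["my_topic"], deny=["_confluent.*"])

for name, result in [("exact", exact), ("prefix", prefix),
                     ("everything", everything), ("nothing", nothing)]:
    print(name, result)
```

Under these assumptions, `allow: ['myTopic']` also selects names that merely start with `myTopic`, while `allow: ['myTopic$']` selects only the exact name. If the allow list matches nothing that the connecting user can see, the selection is empty, which would be consistent with the "Expected non-empty list" wording in the error above.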