Troubleshooting Kafka Ingestion with Kerberos Authentication

Original Slack Thread

Kafka ingestion - POC
I’m trying to run a pipeline in my local notebook.
I don’t have access to Slack from my desktop.
I have the below configuration settings:
Source / type: Kafka
Config
Platform instance - I have the name of my cluster
Connection-
bootstrap - one of the servers from connection profile (Bootstrap servers)
Consumer config
Sec protocol : SASL_SSL
Sasl mechanism : GSSAPI
sasl.kerberos.principal : user id
sasl.kerberos.service.name: esaas
group.id : group id
schema registry url - from connection details

Apologies, I can’t paste the whole config; I don’t have access to Slack on my work machine.

The pipeline run throws an exception:
Transport error
Local: Broker transport failed
Has anyone connected to a Kafka cluster successfully and can share their YAML?
I believe Kerberos is the issue, as the Kerberos principal is not my ID. Not sure how to proceed.

Thanks
DataHub version: 0.12.1

To help you more effectively, could you please provide the following additional information?

  1. Are you using the UI or CLI for ingestion?
  2. You mentioned the DataHub version is 0.12.1, which is helpful.
  3. Could you confirm the data source you are integrating with DataHub? In this case, it seems to be Kafka.

Based on the information provided, it seems you are facing a transport error related to Kafka ingestion with Kerberos authentication. Here are a few steps and configurations that might help resolve the issue:

Configuration Example
Here is an example of a YAML configuration for Kafka ingestion with Kerberos:

source:
  type: "kafka"
  config:
    platform_instance: "YOUR_CLUSTER_ID"
    connection:
      bootstrap: "YOUR_BOOTSTRAP_SERVER"
      consumer_config:
        security.protocol: "SASL_SSL"
        sasl.mechanism: "GSSAPI"
        sasl.kerberos.service.name: "esaas"
        sasl.kerberos.principal: "YOUR_PRINCIPAL"
        sasl.kerberos.keytab: "/path/to/your.keytab"
        # group.id is a Kafka consumer property, so it belongs under consumer_config
        group.id: "YOUR_GROUP_ID"
      schema_registry_url: "YOUR_SCHEMA_REGISTRY_URL"
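
To rule DataHub itself out, it may help to try the same consumer settings directly from Python. The sketch below assumes the confluent-kafka package is installed (DataHub's Kafka source uses this client, and "Local: Broker transport failed" is one of its error messages). The helper names build_consumer_config and try_list_topics are hypothetical, and every value passed in is a placeholder. It also assumes that the principal and keytab settings are not supported by the client on Windows, where the logged-on user's credentials are used instead, so it leaves them out there.

```python
import sys


def build_consumer_config(bootstrap, group_id, service_name,
                          principal=None, keytab=None, platform=sys.platform):
    """Return a consumer config dict for SASL_SSL with GSSAPI (Kerberos)."""
    conf = {
        "bootstrap.servers": bootstrap,
        "group.id": group_id,
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "GSSAPI",
        "sasl.kerberos.service.name": service_name,
    }
    # Assumption: on Windows the client authenticates as the logged-on user,
    # so an explicit principal and keytab are skipped on that platform.
    if not platform.startswith("win"):
        if principal:
            conf["sasl.kerberos.principal"] = principal
        if keytab:
            conf["sasl.kerberos.keytab"] = keytab
    return conf


def try_list_topics(conf, timeout=10.0):
    """Connect with the given config and return the sorted topic names.

    Raises confluent_kafka.KafkaException if the broker cannot be reached
    or authentication fails.
    """
    # Imported lazily so build_consumer_config works without the package.
    from confluent_kafka import Consumer

    consumer = Consumer(conf)
    try:
        metadata = consumer.list_topics(timeout=timeout)
        return sorted(metadata.topics)
    finally:
        consumer.close()
```

For example, try_list_topics(build_consumer_config("YOUR_BOOTSTRAP_SERVER", "YOUR_GROUP_ID", "esaas")) should either return topic names or raise the same transport error, which would point at the connection settings rather than the DataHub recipe.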

Troubleshooting Steps

  1. Verify Kerberos Configuration: Ensure that the Kerberos principal and keytab file are correctly configured and accessible from your environment. The keytab file should be in the specified path.
  2. Network Connectivity: Confirm that the Kafka broker is reachable from your local notebook. You can use tools like telnet or nc to check connectivity to the Kafka port.
  3. Kafka Broker Logs: Check the logs of your Kafka broker for any errors or warnings that might indicate why the connection is failing.
  4. Kerberos Tickets: Ensure that your Kerberos tickets are valid and not expired. You can use the klist command to check your Kerberos tickets.
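
The ticket check in step 4 can also be scripted. This is a minimal sketch, assuming an MIT Kerberos klist is on the PATH (its -s flag prints nothing and exits with status 0 only when the credentials cache is readable and not expired). The function name has_valid_kerberos_ticket is hypothetical, and the klist that ships with Windows uses different options, so this is not expected to work there.

```python
import shutil
import subprocess


def has_valid_kerberos_ticket(klist_path="klist"):
    """Return True/False for a valid ticket cache, or None if klist is missing."""
    exe = shutil.which(klist_path)
    if exe is None:
        return None  # klist not installed or not on PATH
    # Assumption: MIT Kerberos semantics, where `klist -s` exits 0 only if
    # the cache can be read and the tickets have not expired.
    result = subprocess.run(
        [exe, "-s"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A False result suggests running kinit again before retrying the ingestion.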

Example Command for Checking Connectivity
You can use the following command to check whether you can reach the Kafka broker from your local machine (replace 9092 with the port from your bootstrap server address):

nc -zv YOUR_BOOTSTRAP_SERVER 9092
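
If nc is not available on your machine, the same reachability check can be sketched in Python with only the standard library. The function name can_reach is hypothetical, and the host and port are whatever appears in your bootstrap server address.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

A True result only shows that the port is open. The TLS handshake and Kerberos authentication can still fail afterwards.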

Would you like more detailed step-by-step guidance on any of these points?

Keytab is not supported on Windows, and the principal I provided is not mine …