Kafka ingestion config -
How do I specify Kerberos settings, and how do I specify multiple bootstrap servers?
I tried comma-separated values, but I get an error:
Bootstrap host contains bad characters, found
Hey there! Make sure your message includes the following information if relevant, so we can help more effectively!
- Are you using UI or CLI for ingestion?
- Which DataHub version are you using? (e.g. 0.12.0)
- What data source(s) are you integrating with DataHub? (e.g. BigQuery)
<@U06TM7M40RK> Can you help with the questions in this thread?
We’re experimenting with a new AI assistant for questions in the <#CUMUWQU66> channel to help you track down helpful resources more quickly. Keep in mind that responses may not always be spot-on; feel free to ask it follow-up questions, and don’t forget to give its responses a thumbs-up or thumbs-down so other folks in the Community know if you’re still looking for help on the topic!
To help you more effectively, could you please provide the following additional information?
- Are you using the UI or CLI for ingestion?
- Which DataHub version are you using?
- Could you confirm the data source you are integrating with DataHub? In this case, it seems to be Kafka.
Based on the information provided so far, here is how you can specify Kerberos settings and multiple bootstrap servers for Kafka ingestion in DataHub:
Configuration Example
Below is an example of a YAML configuration for Kafka ingestion with Kerberos authentication and multiple bootstrap servers:
source:
  type: "kafka"
  config:
    platform_instance: "YOUR_CLUSTER_ID"
    connection:
      bootstrap: "broker1:9092,broker2:9092,broker3:9092"
      consumer_config:
        security.protocol: "SASL_SSL"
        sasl.mechanism: "GSSAPI"
        sasl.kerberos.service.name: "kafka"
        sasl.kerberos.principal: "YOUR_PRINCIPAL"
        sasl.kerberos.keytab: "/path/to/your.keytab"
      schema_registry_url: "YOUR_SCHEMA_REGISTRY_URL"
    group_id: "YOUR_GROUP_ID"
Key Points:
- Bootstrap Servers: You can specify multiple bootstrap servers by separating them with commas in the `bootstrap` parameter.
- Kerberos Settings: The `consumer_config` section includes the necessary Kerberos settings such as `security.protocol`, `sasl.mechanism`, `sasl.kerberos.service.name`, `sasl.kerberos.principal`, and `sasl.kerberos.keytab`.
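If you are running ingestion from the CLI, a recipe like the one above can be saved to a file and executed with the datahub CLI. This is a minimal sketch: the recipe filename is just an example, and the pip step assumes the Kafka source plugin is not already installed.
pip install 'acryl-datahub[kafka]'            # install the Kafka ingestion plugin (if not already installed)
datahub ingest -c kafka_kerberos_recipe.yml   # run the recipe file shown above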
Troubleshooting Steps
- Verify Kerberos Configuration: Ensure that the Kerberos principal and keytab file are correctly configured and accessible from your environment.
- Network Connectivity: Confirm that the Kafka brokers are reachable from the machine where you run the ingestion.
- Kafka Broker Logs: Check the logs of your Kafka brokers for any errors or warnings.
- Kerberos Tickets: Ensure that your Kerberos tickets are valid and not expired (see the kinit/klist example after this list).
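As a quick sanity check on the Kerberos side, you can request a ticket with the same keytab and principal used in the recipe and then list it; the keytab path and principal below are the placeholders from the example above:
kinit -kt /path/to/your.keytab YOUR_PRINCIPAL   # obtain a ticket from the keytab
klist                                           # confirm a valid, unexpired ticket exists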
Example Command for Checking Connectivity
You can use the following command to check if you can reach the Kafka brokers from your local machine:
nc -zv broker1 9092
nc -zv broker2 9092
nc -zv broker3 9092
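If your recipe also uses a schema registry, you can confirm it is reachable in a similar way. This assumes a Confluent-compatible registry exposing the standard REST API, with YOUR_SCHEMA_REGISTRY_URL being the same placeholder as in the recipe:
curl -s YOUR_SCHEMA_REGISTRY_URL/subjects   # should return a JSON array of registered subjects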
Would you like more detailed step-by-step guidance on any of these points?
Sources:
- <https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/docs/sources/kafka/kafka.md|Kafka Ingestion Source Configuration>
- <https://github.com/datahub-project/datahub/blob/master/docs/how/kafka-config.md|Configuring Kafka in DataHub>
Hint: Mention <@U06TM7M40RK> in the thread for followups.
Multiple bootstrap servers did not work
bootstrap: “srrver1, server2”
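One thing worth checking, given the "Bootstrap host contains bad characters" error above: in the working example the bootstrap value has no spaces after the commas, includes a port for each host, and uses plain straight quotes. Under that assumption, the value would look like:
bootstrap: "server1:9092,server2:9092"   # no space after the comma, port on every host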