<@U06TM7M40RK> I’m getting the error: bootstrap: Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 0ms in state APIVERSION_QUERY, 4 identical error(s) suppressed)
I’m using AWS MSK and this is my configuration for the acryl-datahub-actions:
```yaml
enabled: true
image:
  repository: acryldata/datahub-actions
  tag: "v0.1.1"
# Add custom command / arguments to this job. Useful if you need a custom startup or shutdown script
# to run
# command: customCommand
# args: []
# mount the k8s secret as a volume in the container, each key name is mounted as a file on the mount path /etc/datahub/ingestion-secret-files
# ingestionSecretFiles:
#   name: ${K8S_SECRET_NAME}
#   defaultMode: "0444"
extraVolumes:
  - configMap:
      defaultMode: 292
      name: datahub-executor
    name: datahub-custom-actions-config
extraVolumeMounts:
  - mountPath: /etc/datahub/actions/system/conf
    name: datahub-custom-actions-config
configs:
  executor.yaml: |
    name: "ingestion_executor"
    source:
      type: "kafka"
      config:
        connection:
          bootstrap: ${KAFKA_BOOTSTRAP_SERVER:-localhost:9092}
          schema_registry_url: ${SCHEMA_REGISTRY_URL:-http://localhost:8081}
          consumer_config:
            security.protocol: ${KAFKA_PROPERTIES_SECURITY_PROTOCOL:-PLAINTEXT}
            sasl.mechanism: ${KAFKA_PROPERTIES_SASL_MECHANISM:-PLAIN}
        topic_routes:
          mcl: ${METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME:-MetadataChangeLog_Versioned_v1} # Topic name for MetadataChangeLogEvent_v1 events.
          mcl_timeseries: ${METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME:-MetadataChangeLog_Timeseries_v1} # Topic name for MetadataChangeLogEvent_v1 timeseries events.
          pe: ${PLATFORM_EVENT_TOPIC_NAME:-PlatformEvent_v1} # Topic name for PlatformEvent_v1 events.
    filter:
      event_type: "MetadataChangeLogEvent_v1"
      event:
        entityType: "dataHubExecutionRequest"
        changeType: "UPSERT"
        aspectName:
          - "dataHubExecutionRequestInput"
          - "dataHubExecutionRequestSignal"
        aspect:
          value:
            executorId: "${EXECUTOR_ID:-default}"
    action:
      type: "executor"
      config:
        executor_id: "${EXECUTOR_ID:-default}"
        datahub:
          server: "${DATAHUB_GMS_PROTOCOL:-http}://${DATAHUB_GMS_HOST:-localhost}:${DATAHUB_GMS_PORT:-8080}"
          extra_headers:
            Authorization: "Basic ${DATAHUB_SYSTEM_CLIENT_ID:-__datahub_system}:${DATAHUB_SYSTEM_CLIENT_SECRET:-JohnSnowKnowsNothing}"
resources:
  limits:
    memory: 50120Mi
  requests:
    cpu: 900m
    memory: 1024Mi
```
It seems like you are encountering an issue with connecting to a Kafka broker, which might be related to the security protocol configuration or the broker version. This error typically indicates that the Kafka client is unable to establish a connection due to incorrect security settings or an incompatible broker version.
Here are some steps you can take to troubleshoot and resolve this issue:
Verify Security Protocol Configuration: Ensure that the security protocol is correctly set. If you are using SSL, your configuration should look something like this:
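A minimal sketch of the relevant part of the Actions `consumer_config`, assuming a TLS-encrypted listener (the broker address is a placeholder):

```yaml
source:
  type: "kafka"
  config:
    connection:
      bootstrap: "b-1.example.kafka.us-east-1.amazonaws.com:9094"  # placeholder MSK broker address
      consumer_config:
        # Use "SSL" for a TLS-only listener; use "SASL_SSL" if the listener
        # also requires SASL authentication (e.g. MSK IAM on port 9098)
        security.protocol: "SSL"
```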
Check Kafka Broker Version: Ensure that your Kafka broker version is not less than 0.10, as the error message suggests. You can verify the version by checking the Kafka broker logs or configuration.
Custom Executor Configuration: If you are using DataHub Actions, you might need to customize the executor.yaml file to include the correct Kafka configurations. Here is an example configuration:
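As a sketch, the `source` section of `executor.yaml` can take its security settings from environment variables; the `SASL_SSL` / `AWS_MSK_IAM` defaults shown here are assumptions for an MSK IAM listener, not values confirmed for your cluster:

```yaml
name: "ingestion_executor"
source:
  type: "kafka"
  config:
    connection:
      bootstrap: ${KAFKA_BOOTSTRAP_SERVER:-localhost:9092}
      schema_registry_url: ${SCHEMA_REGISTRY_URL:-http://localhost:8081}
      consumer_config:
        # Defaults below assume an MSK IAM listener (typically port 9098)
        security.protocol: ${KAFKA_PROPERTIES_SECURITY_PROTOCOL:-SASL_SSL}
        sasl.mechanism: ${KAFKA_PROPERTIES_SASL_MECHANISM:-AWS_MSK_IAM}
```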
I’m here to help with questions related to DataHub. Could you please provide more details or rephrase your question so that I can assist you better with your DataHub-related query?
<@U06TM7M40RK> With the given config, I’m getting the error [2024-12-31 15:06:01,777] INFO {datahub.ingestion.source.git.git_import:56} - :hourglass_flowing_sand: Cloning repo 'git@gitlab.com:group/looker-dev-cluster.git', this can take some time...
The topics were created by the kafkaSetupJob, so the connection works, but when I try to create a connection from the frontend to a Redshift database, the status of the connection stays “Pending” and the datahub-acryl-datahub-actions log shows the error
Based on the configuration you’ve provided, it seems like you are setting up Kafka with AWS MSK IAM authentication. However, there are a few things to verify and adjust to ensure that your setup is correct:
JAAS Configuration: Ensure that the sasl.jaas.config is correctly set. The configuration should include all necessary parameters, such as awsRoleArn and awsStsRegion, if required. Here’s an example:
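A sketch of such a JAAS entry for the Java-based components (the role ARN and region are placeholders to replace with your own):

```yaml
springKafkaConfigurationOverrides:
  security.protocol: SASL_SSL
  sasl.mechanism: AWS_MSK_IAM
  sasl.jaas.config: >-
    software.amazon.msk.auth.iam.IAMLoginModule required
    awsRoleArn="arn:aws:iam::123456789012:role/msk-client-role"
    awsStsRegion="eu-west-1";
  sasl.client.callback.handler.class: software.amazon.msk.auth.iam.IAMClientCallbackHandler
```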
Environment Variables: Ensure that all environment variables are correctly set in your KafkaSetupJob. The KAFKA_PROPERTIES_SASL_JAAS_CONFIG should match the one used in your springKafkaConfigurationOverrides.
Debugging: Since you have awsDebugCreds=true in your JAAS config, check the logs for any additional debug information that might indicate what is going wrong.
Network and IAM Role: Verify that the network settings allow the connection to the MSK cluster and that the IAM role associated with your Kubernetes service account has the necessary permissions to access the MSK cluster.
If you need more detailed guidance on any specific part of this process, please let me know.
<@U06TM7M40RK> in your reply number 4: 4. Debugging: Since you have awsDebugCreds=true in your JAAS config, check the logs for any additional debug information that might indicate what is going wrong.
What logs should I check?
It seems likely that you should check the logs related to your Kafka client and the DataHub components interacting with Kafka. Here are some specific logs you might want to examine:
Kafka Client Logs: These logs will provide information about the Kafka client’s connection attempts and any authentication issues. If you are running Kafka in a containerized environment, you can use commands like docker logs <container-id> or kubectl logs <pod-name> to access the logs.
DataHub Logs: Check the logs for the DataHub components that are interacting with Kafka, such as the datahub-gms, datahub-frontend, or any ingestion jobs. These logs can provide insights into any issues with Kafka topic creation or message consumption.
AWS MSK Logs: If you have access to AWS CloudWatch, you can check the logs for your MSK cluster. These logs can provide information about the broker’s status and any connection attempts from clients.
Application Logs: If you have configured logging in your application, check the logs for any errors or warnings related to Kafka connections. Look for messages related to SASL, IAM, or authentication.
Debug Information: Since you have awsDebugCreds=true in your JAAS config, the logs should include additional debug information about the AWS credentials being used. This can help identify any issues with the IAM role or permissions.
If you need more detailed guidance on how to access or interpret these logs, please let me know.
It seems likely that the issue you’re experiencing is related to the configuration or communication between the DataHub frontend and the Kafka setup, especially since the Kafka setup job is working as expected and creating the topics. Here are a few areas to investigate:
Frontend Configuration: Ensure that the DataHub frontend is correctly configured to communicate with Kafka. This includes verifying that the bootstrap.servers and other Kafka-related configurations are correctly set in the frontend’s environment variables or configuration files.
Network Connectivity: Check if there are any network issues preventing the DataHub frontend from reaching the Kafka brokers. This could include firewall rules, security groups, or network policies that might be blocking the connection.
Authentication and Authorization: Since you’re using AWS MSK IAM for authentication, ensure that the IAM roles and policies are correctly configured to allow the DataHub frontend to connect to Kafka. Verify that the necessary permissions are granted for the frontend to produce and consume messages.
Frontend Logs: Examine the logs of the DataHub frontend for any error messages or warnings related to Kafka connections. These logs can provide insights into what might be going wrong during the connection attempt.
Kafka Client Configuration: Double-check the Kafka client configuration in the DataHub frontend. Ensure that the sasl.jaas.config, sasl.mechanism, and other related settings are correctly specified.
If you need more detailed guidance on any specific part of this process, please let me know.
<@U06TM7M40RK> no errors in any other log so far, only in datahub-acryl-datahub-actions logs: amazonaws.com:9098/bootstrap: Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 0ms in state APIVERSION_QUERY, 4 identical error(s) suppressed)
The error message you’re encountering indicates that there might be an issue with the security protocol configuration when the DataHub Actions component is trying to connect to the Kafka broker. Here are some steps to troubleshoot and resolve this issue:
Verify Security Protocol Configuration: Ensure that the security protocol is correctly set in your DataHub Actions configuration. Since you’re using AWS MSK with IAM, the configuration should look something like this:
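A sketch of the `consumer_config` block for an MSK IAM listener; whether the `AWS_MSK_IAM` mechanism is supported depends on the Kafka client library inside the Actions container, so treat these values as assumptions to verify rather than a confirmed recipe:

```yaml
consumer_config:
  security.protocol: SASL_SSL
  sasl.mechanism: AWS_MSK_IAM
```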
Check Kafka Broker Version: Ensure that your Kafka broker version is not less than 0.10, as the error message suggests. This can be verified by checking the Kafka broker logs or configuration.
Network Connectivity: Verify that there are no network issues preventing the DataHub Actions component from reaching the Kafka broker. This includes checking firewall rules, security groups, and network policies.
IAM Role and Permissions: Ensure that the IAM role associated with your DataHub Actions component has the necessary permissions to access the MSK cluster. This includes permissions for kafka-cluster:Connect and other relevant actions.
DataHub Actions Logs: Since the error is appearing in the DataHub Actions logs, closely examine these logs for any additional context or error messages that might provide more insights into the issue.
If you need more detailed guidance on any specific part of this process, please let me know.
Given that you are using Kafka version 3.6.2, the broker version should not be an issue since it is well above the minimum required version of 0.10. Therefore, the problem is likely related to the security protocol configuration or network connectivity. Here are some steps to further troubleshoot the issue:
Security Protocol Configuration: Double-check the security protocol settings in your DataHub Actions configuration. Ensure that all necessary parameters are correctly set for AWS MSK IAM authentication. Here is a sample configuration:
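One way to supply these values is via environment variables on the acryl-datahub-actions deployment, matching the `${KAFKA_PROPERTIES_*}` placeholders already used in the `executor.yaml`. This is a sketch; the `extraEnvs` key and the `SASL_SSL` / `AWS_MSK_IAM` values are assumptions to check against your chart version and cluster setup:

```yaml
extraEnvs:
  - name: KAFKA_PROPERTIES_SECURITY_PROTOCOL
    value: "SASL_SSL"
  - name: KAFKA_PROPERTIES_SASL_MECHANISM
    value: "AWS_MSK_IAM"
```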
Network Connectivity: Verify that there are no network issues preventing the DataHub Actions component from reaching the Kafka broker. This includes checking firewall rules, security groups, and network policies that might be blocking the connection.
IAM Role and Permissions: Ensure that the IAM role associated with your DataHub Actions component has the necessary permissions to access the MSK cluster. This includes permissions for kafka-cluster:Connect and other relevant actions.
DataHub Actions Logs: Since the error is appearing in the DataHub Actions logs, closely examine these logs for any additional context or error messages that might provide more insights into the issue.
If you need more detailed guidance on any specific part of this process, please let me know.
<@U06TM7M40RK> I’m finding this error in the broker logs: INFO [SocketServer listenerType=ZK_BROKER, nodeId=3] Failed authentication with /INTERNAL_IP (channelId=INTERNAL_IP-INTERNAL_IP-12683) (SSL handshake failed) (org.apache.kafka.common.network.Selector)
The error message you’re encountering in the broker logs indicates that there is an SSL handshake failure between the Kafka broker and the client. This typically points to issues with the SSL configuration. Here are some steps to troubleshoot and resolve this issue:
Verify SSL Certificates: Ensure that the SSL certificates used by both the Kafka broker and the client are valid and correctly configured. This includes checking the certificate chain, expiration dates, and ensuring that the certificates are trusted by both parties.
Check SSL Configuration: Review the SSL configuration on both the Kafka broker and the client. Ensure that the following settings are correctly specified:
ssl.keystore.location: Path to the keystore file.
ssl.keystore.password: Password for the keystore.
ssl.truststore.location: Path to the truststore file.
ssl.truststore.password: Password for the truststore.
Hostname Verification: Ensure that hostname verification is correctly configured. If the broker’s certificate does not match the hostname, you may need to adjust the ssl.endpoint.identification.algorithm setting.
Protocol Compatibility: Ensure that both the client and the broker are using compatible SSL/TLS protocols. You can specify the protocol using the ssl.protocol setting.
Logs for More Details: Check both the client and broker logs for more detailed error messages that might provide additional context about the SSL handshake failure.
Network Configuration: Verify that there are no network issues, such as firewalls or security groups, that might be interfering with the SSL handshake.
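The SSL-related settings above can be sketched as a client configuration; every path and password here is a placeholder, and the keystore entries are only relevant if mutual TLS is required:

```yaml
consumer_config:
  security.protocol: SSL
  ssl.truststore.location: /etc/kafka/secrets/client.truststore.jks  # placeholder path
  ssl.truststore.password: changeit                                  # placeholder
  # Only needed for mutual TLS (client certificate authentication):
  # ssl.keystore.location: /etc/kafka/secrets/client.keystore.jks
  # ssl.keystore.password: changeit
  # For debugging only: an empty value disables hostname verification
  # ssl.endpoint.identification.algorithm: ""
```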
If you need more detailed guidance on any specific part of this process, please let me know.
When using AWS MSK IAM with the SASL_SSL security protocol, the configuration typically focuses on the SASL mechanism for authentication rather than the SSL keystore and truststore settings. However, SSL is still used for encrypting the connection, so you may need to configure the truststore to ensure that the client trusts the broker’s certificate. Here’s a breakdown:
Truststore Configuration:
You generally need to configure the ssl.truststore.location and ssl.truststore.password to ensure that your client trusts the certificates presented by the Kafka brokers. This is necessary for establishing a secure SSL connection.
Keystore Configuration:
The ssl.keystore.location and ssl.keystore.password are typically used when mutual TLS (mTLS) is required, where both the client and server authenticate each other. For AWS MSK IAM, this is not usually required unless your specific setup mandates it.
SASL_SSL with AWS MSK IAM:
The primary focus is on configuring the SASL mechanism. Ensure that the sasl.jaas.config and sasl.client.callback.handler.class are correctly set for AWS MSK IAM, as these handle the authentication part.
Here’s a sample configuration for AWS MSK IAM with SASL_SSL:
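A sketch of such a configuration for the Java-based components, keeping the `awsDebugCreds=true` flag from your JAAS config; the truststore lines are commented out because MSK brokers typically present certificates signed by a public CA that the default JVM truststore already trusts, so configure them only if your environment uses a custom CA:

```yaml
springKafkaConfigurationOverrides:
  security.protocol: SASL_SSL
  sasl.mechanism: AWS_MSK_IAM
  sasl.jaas.config: software.amazon.msk.auth.iam.IAMLoginModule required awsDebugCreds=true;
  sasl.client.callback.handler.class: software.amazon.msk.auth.iam.IAMClientCallbackHandler
  # Only if a custom CA is required (paths and password are placeholders):
  # ssl.truststore.location: /etc/kafka/secrets/client.truststore.jks
  # ssl.truststore.password: changeit
```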