Components in DataHub connecting to Kafka and resolving issues with user action configurations

Original Slack Thread

<@U06TM7M40RK> which components from datahub connect to kafka?

DataHub requires Kafka to operate, and several components within DataHub connect to Kafka. According to the <https://datahubproject.io/docs/0.14.0/how/kafka-config|DataHub Kafka Configuration Guide>, the following components require a connection to Kafka:

  1. Metadata Service (datahub-gms container): This is the core service that manages metadata within DataHub.
  2. MCE Consumer Job (datahub-mce-consumer container): This is used if standalone consumers are deployed.
  3. MAE Consumer Job (datahub-mae-consumer container): Like the MCE consumer, this is used only if standalone consumers are deployed.
  4. DataHub Frontend (datahub-frontend container): The user interface component, which needs a Kafka connection only if product analytics are enabled.

Kafka serves two roles for these components: a durable log that stores inbound requests to update the Metadata Graph, and a change log that records the updates already made to the Metadata Graph.
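Since each of these components needs to reach the Kafka brokers, a quick reachability check from inside a container can help narrow down connection problems. The following is a minimal sketch, not part of DataHub: the helper names `parse_bootstrap_servers` and `can_connect` are hypothetical, and it assumes the broker list is a plain comma-separated `host:port` string (as typically supplied through the `KAFKA_BOOTSTRAP_SERVER` environment variable), with 9092 as a fallback port.

```python
import socket


def parse_bootstrap_servers(value):
    """Split a 'host:port,host:port' string into (host, port) tuples.

    Assumption: entries are plain host:port pairs with no protocol prefix.
    """
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if not host:
            # No port given; fall back to Kafka's conventional port (assumption).
            host, port = entry, "9092"
        servers.append((host, int(port)))
    return servers


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

To use it, read the broker string from the component's environment (for example `os.environ["KAFKA_BOOTSTRAP_SERVER"]`), pass it to `parse_bootstrap_servers`, and call `can_connect` on each pair. A successful TCP connection only shows that the port is open; it does not confirm that authentication or topic access is configured correctly.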

Would you like more detailed information on how these components interact with Kafka or how to configure them?


<@U06TM7M40RK> No user action configurations found. Not starting user actions.

The message “No user action configurations found. Not starting user actions.” typically indicates that the DataHub Actions service did not find any user-defined action configurations to execute. This can happen if the configuration files for actions are not correctly set up or not accessible by the service.

Here are some steps to troubleshoot and resolve this issue:

  1. Check Configuration Files: Ensure that the configuration files for your actions are placed in the expected directory. According to the DataHub Actions documentation for Microsoft Teams, these files should be located in a directory that the DataHub Actions service can access, such as /etc/datahub/actions/conf/.

  2. Verify Permissions: Make sure that the DataHub Actions service has the necessary permissions to read the configuration files. There have been instances where permission issues prevented access to the configuration files, as noted in a GitHub issue in the datahub-project/datahub repository.

  3. Configuration Content: Double-check the content of your configuration files to ensure they are correctly formatted and contain valid action definitions. For example, a Teams action configuration should specify the webhook_url and other necessary parameters.

  4. Docker and Kubernetes Setup: If you are running DataHub in a Docker or Kubernetes environment, ensure that the configuration files are correctly mounted into the container. This might involve setting up volume mounts in your Docker Compose or Kubernetes configuration, as discussed in a Slack thread.

  5. Logs for More Details: Check the logs of the DataHub Actions service for any additional error messages or warnings that might provide more context about why the configurations were not found.
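The first two checks above (files present, files readable) can be scripted and run inside the Actions container. This is a minimal sketch under stated assumptions: `check_action_configs` is a hypothetical helper, not part of DataHub Actions, and it assumes action configurations are `.yaml` or `.yml` files sitting directly in the configuration directory. It does not validate the YAML content itself.

```python
import os
from pathlib import Path


def check_action_configs(conf_dir="/etc/datahub/actions/conf"):
    """Return a list of human-readable problems found in conf_dir.

    An empty list means at least one readable, non-empty config file exists.
    """
    root = Path(conf_dir)
    if not root.is_dir():
        return [f"{root} does not exist or is not a directory"]

    problems = []
    # Assumption: action configs use a .yaml or .yml extension.
    configs = sorted(p for p in root.iterdir() if p.suffix in (".yaml", ".yml"))
    if not configs:
        problems.append(f"no .yaml/.yml files found in {root}")

    for path in configs:
        if not os.access(path, os.R_OK):
            problems.append(f"{path.name}: not readable by the current user")
        elif path.stat().st_size == 0:
            problems.append(f"{path.name}: file is empty")
    return problems
```

Run it as the same user the Actions service runs as; otherwise the readability check reflects the wrong permissions. If the function reports a missing directory or no files in a Docker or Kubernetes deployment, the volume mount is the likely place to look next.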

Would you like more detailed guidance on any of these steps?
