Troubleshooting DataHub UI Ingestion Connectivity Issues

Original Slack Thread

<@U06TM7M40RK> I am trying to run an ingestion from UI but I cannot see it run in the UI even though I get the message that successfully created ingestion source. Why is this happening?

It seems like your ingestion request is being submitted but not executed, which can be due to several reasons. Here are some steps to troubleshoot the issue:

  1. Check datahub-actions Container: The datahub-actions container is responsible for executing ingestion requests. If this container is down or unhealthy, the ingestion source is still created but the run is never picked up. You can verify the container's state with docker ps and review its output with docker logs. According to the DataHub UI Ingestion documentation (https://github.com/datahub-project/datahub/blob/master/docs/ui-ingestion.md), if you see ‘N/A’ and the ingestion run state never changes to ‘Running’, this may mean that your executor (datahub-actions) container is down.

  2. Inspect Logs: Check the logs of the datahub-actions container for any errors or warnings. This can provide more context about what might be going wrong. You can do this by running:

    docker logs <container-id>
    

    Replace <container-id> with the actual ID of the datahub-actions container.

  3. Verify Configuration: Ensure that your ingestion configuration is correct. Double-check the source-specific fields; for a Snowflake source, for example, these include account_id, username, password, role, and warehouse.

  4. Network Connectivity: Ensure that your DataHub instance can reach the data source. Network issues can cause the connection to fail. Verify DNS resolution and network policies that might be blocking the connection.

  5. Resource Allocation: Ensure that your environment has sufficient resources allocated to the DataHub components, especially the datahub-actions pod.
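
Steps 1 and 2 above can be scripted if you check the executor often. The following is a minimal sketch, assuming the Docker CLI is on your PATH and that the container name contains "datahub-actions" (the exact name depends on your compose project). The helper names are hypothetical and are not part of DataHub.

```python
import shutil
import subprocess


def docker_ps_command(name_filter="datahub-actions"):
    """Build a `docker ps` command listing running AND stopped containers by name."""
    return [
        "docker", "ps", "--all",
        "--filter", f"name={name_filter}",
        # Tab-separated columns so the output is easy to split.
        "--format", "{{.ID}}\t{{.Names}}\t{{.Status}}",
    ]


def parse_ps_output(output):
    """Turn tab-separated `docker ps` output into a list of dicts."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        container_id, name, status = line.split("\t", 2)
        rows.append({"id": container_id, "name": name, "status": status})
    return rows


def find_actions_containers(name_filter="datahub-actions"):
    """Return matching containers, or an empty list if Docker is unavailable."""
    if shutil.which("docker") is None:
        return []
    result = subprocess.run(
        docker_ps_command(name_filter),
        capture_output=True, text=True, check=False,
    )
    return parse_ps_output(result.stdout)


def tail_logs(container_id, lines=100):
    """Return the last `lines` log lines (stdout and stderr) of a container."""
    result = subprocess.run(
        ["docker", "logs", "--tail", str(lines), container_id],
        capture_output=True, text=True, check=False,
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    for container in find_actions_containers():
        print(container["name"], "->", container["status"])
        print(tail_logs(container["id"], lines=50))
```

A status that starts with "Exited" or "Restarting" instead of "Up" is consistent with the symptom described in step 1: the source is created, but no run ever starts.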

Would you like more detailed steps on any of these troubleshooting points or further assistance on a specific part of the process?

<@U06TM7M40RK> This seems to be the error when I check datahub-actions logs. urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='datahub-gms', port=8080): Max retries exceeded with url: /aspects?action=ingestProposal (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f94086385e0>: Failed to establish a new connection: [Errno -2] Name or service not known'))

The error message you’re encountering indicates that the datahub-actions container is unable to connect to the DataHub GMS (Generalized Metadata Service) at datahub-gms:8080. This is often due to network connectivity issues or incorrect configuration. Here are some steps to resolve this issue:

  1. Check Hostname Resolution: The error suggests that the hostname datahub-gms cannot be resolved. Ensure that the hostname is correctly configured and that your network setup allows for proper DNS resolution. If you’re running DataHub in a Docker environment, make sure that the Docker network is set up correctly and that the datahub-gms service is accessible from the datahub-actions container.

  2. Verify Service Availability: Ensure that the DataHub GMS service is running and accessible. You can check this by trying to access the GMS service from within the datahub-actions container. You can do this by executing a shell in the container and using tools like curl or ping to test connectivity:

    docker exec -it <datahub-actions-container-id> /bin/sh
    curl http://datahub-gms:8080/health

  3. Update Configuration: If you’re running DataHub outside of Docker or in a different network setup, you might need to update the configuration to use the correct hostname or IP address. For example, if you’re running the ingestion from your local machine, you might need to use localhost:8080 instead of datahub-gms:8080. According to a GitHub issue in the datahub-project/datahub repository, the sink configuration should point to localhost:8080 if the ingestion is running outside of Docker.

  4. Check Network Policies: Ensure that there are no network policies or firewalls blocking the connection between the datahub-actions container and the GMS service.

  5. Restart Services: Sometimes, simply restarting the services can resolve transient network issues. Try restarting the DataHub services and see if the issue persists.
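
If curl or ping is not installed in the container, the checks from steps 1 and 2 can also be run with Python's standard library. This is a rough sketch, assuming a Python interpreter is available inside the datahub-actions container. The function names are hypothetical, and the /health path is an assumption about your GMS deployment, so substitute any endpoint you know responds.

```python
import socket
import urllib.error
import urllib.request


def can_resolve(hostname):
    """Return True if the hostname resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # This is the failure behind "[Errno -2] Name or service not known".
        return False


def check_gms(host="datahub-gms", port=8080, path="/health", timeout=5.0):
    """Report whether failure happens at DNS, at connect time, or at the HTTP layer."""
    if not can_resolve(host):
        return f"DNS failure: cannot resolve {host!r}"
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"reachable: HTTP {response.status} from {url}"
    except urllib.error.HTTPError as exc:
        # The server answered, so name resolution and networking are fine.
        return f"reachable, but HTTP {exc.code} from {url}"
    except (urllib.error.URLError, OSError) as exc:
        return f"resolved, but connection failed: {exc}"


if __name__ == "__main__":
    print(check_gms())
```

A "DNS failure" result from inside the container matches the error in the logs and points at the Docker network or service name rather than at GMS itself.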

Would you like more detailed guidance on any of these steps or further assistance with your setup?

<@U06TM7M40RK> While ingesting from the CLI, I had to specify a token in the sink config for the ingestion to work. But while ingesting from the UI, I followed the onscreen prompts and there was no prompt that asked for a token, so I did not provide one. Is that what’s causing the above error?

When ingesting data using the DataHub UI, the system is designed to handle authentication internally, and you typically do not need to manually provide a token. The UI should automatically manage the authentication process, including the use of tokens, as part of its internal configuration. This is different from CLI-based ingestion, where you need to explicitly specify a token in the sink configuration if metadata service authentication is enabled.

However, the absence of a token is unlikely to be the direct cause of the connectivity error you described earlier. The MaxRetryError shows that the datahub-gms hostname could not be resolved, so the request failed before any authentication could take place; a missing or rejected token would normally surface as an HTTP 401 response instead. The issue therefore points to network connectivity or service configuration.

Here are some additional considerations:

  1. UI Configuration: Ensure that the DataHub UI is correctly configured to communicate with the GMS service. This includes verifying that the environment variables and network settings are correctly set up to allow the UI to access the GMS.

  2. Service Health: Double-check that all DataHub services, including the GMS and datahub-actions, are running and healthy. Any service downtime or misconfiguration could lead to connectivity issues.

  3. Logs and Debugging: Continue to monitor the logs for any additional error messages or warnings that might provide more context about the issue. This can help identify whether the problem is related to authentication, network, or service configuration.

  4. Token Management: If you suspect that authentication might still be an issue, you can manually add a token to the UI ingestion configuration as a workaround. This involves editing the YAML configuration in the UI to include a sink section with the token, similar to how you would configure it for CLI ingestion. However, this should not be necessary if the UI is functioning as expected.
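
For the workaround in point 4, the sink section might look like the output of the sketch below. It assumes the datahub-rest sink with its server and token fields, as used in CLI recipes. GMS_TOKEN is a hypothetical secret name, and the ${...} reference syntax is an assumption about how your UI recipe resolves secrets. The helper functions are illustrative only.

```python
import json


def build_sink(server="http://datahub-gms:8080", token_ref="${GMS_TOKEN}"):
    """Return a recipe fragment with an explicit sink section (illustrative)."""
    return {
        "sink": {
            "type": "datahub-rest",
            "config": {
                "server": server,
                # Reference a stored secret instead of pasting the raw token.
                "token": token_ref,
            },
        }
    }


def to_yaml_lines(node, indent=0):
    """Render nested dicts of scalars as simple block-style YAML lines."""
    lines = []
    pad = "  " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 1))
        else:
            # json.dumps yields a double-quoted scalar for plain strings.
            lines.append(f"{pad}{key}: {json.dumps(value)}")
    return lines


if __name__ == "__main__":
    print("\n".join(to_yaml_lines(build_sink())))
```

The printed block can be appended to the recipe in the UI's YAML editor. Adding it does not fix a hostname that fails to resolve, so it is only worth trying once the datahub-gms name resolves from the datahub-actions container.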

If you would like more detailed guidance on any of these points or further assistance with your setup, please let me know.
