Troubleshooting LDAP Issue in DataHub v0.13.2

Original Slack Thread

Hi there! I think we have found a bug. We are using LDAP for employees to log into DataHub (v0.13.2). All employees (new and old) can log in normally and access DataHub, but when a new employee logs in, DataHub does not create a new user entity for them as it did in prior versions. If the employee clicks on “Your Profile”, the page shows “Not Found” <@U06TM7M40RK>

It sounds like you’re encountering an issue where new employees logging in via LDAP are not having their user entities created in DataHub, which results in a “Not Found” error when they try to access their profile. This issue has been discussed in various contexts within the DataHub community.

Potential Causes and Solutions

  1. LDAP Ingestion Configuration:

    • Ensure that your LDAP ingestion is correctly configured to create user entities. According to a Slack discussion, users need to be ingested from the LDAP server using the LDAP source. This process provisions the users and groups in DataHub.
  2. User Entity Creation:

    • There might be an issue with the ingestion pipeline or the way new users are being handled. As per a GitHub issue (datahub-project/datahub), the frontend might assume the presence of certain aspects like CorpUserEditableInfo. Ensure that your ingestion process includes all necessary aspects.
  3. Stateful Ingestion:

    • If you are using stateful ingestion, there might be issues related to checkpointing and soft deletes. According to a GitHub issue (datahub-project/datahub), there can be problems if the ingestion process is interrupted. Ensure that your ingestion process completes successfully and that checkpoints are correctly managed.

Steps to Diagnose and Fix

  1. Check LDAP Ingestion Logs:

    • Verify the logs for the LDAP ingestion process to ensure that new users are being ingested correctly. Look for any errors or warnings that might indicate issues.
  2. Verify User Entity Creation:

    • Manually check if the user entities are being created in the DataHub database. You can use the DataHub REST API to query for user entities and verify their presence.
  3. Update Ingestion Configuration:

    • Ensure that your ingestion configuration includes all necessary aspects for user entities. You might need to update your ingestion scripts to include CorpUserEditableInfo and other required aspects.
  4. Test with a Known Good Configuration:

    • If possible, test with a known good configuration from a previous version to see if the issue persists. This can help identify if the problem is specific to the new version.
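
Step 2 above (verifying that the user entity exists) could be sketched roughly as follows. This is a minimal sketch, not an official client: it assumes GMS is reachable at a URL such as http://localhost:8080, that it serves entities at /entities/<URL-encoded urn>, and that a missing entity comes back as HTTP 404 (some versions may instead return a 200 with only a key aspect, so inspect the body too). The function names, the username "jdoe" and the optional token are hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Optional
from urllib.parse import quote


def corpuser_urn(username: str) -> str:
    """Build the corpuser URN for a login name."""
    return f"urn:li:corpuser:{username}"


def entity_url(gms_server: str, urn: str) -> str:
    """Build the GMS entity lookup URL; the URN must be fully URL-encoded."""
    return f"{gms_server.rstrip('/')}/entities/{quote(urn, safe='')}"


def fetch_user(gms_server: str, username: str, token: Optional[str] = None) -> Optional[dict]:
    """Return the entity JSON for the user, or None if GMS answers 404.

    Assumption: a missing entity yields HTTP 404. Other HTTP errors are re-raised.
    """
    request = urllib.request.Request(entity_url(gms_server, corpuser_urn(username)))
    if token:
        # Only needed when metadata service authentication is enabled.
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise
```

Calling fetch_user("http://localhost:8080", "jdoe") for a user who has just logged in for the first time would then show whether any record exists behind the profile page.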

Would you like more detailed steps on any of these points?

<@U06TM7M40RK> Can you help with the questions in this thread?


To address the issue where new employees logging in via LDAP are not having user entities created in DataHub v0.13.2, we can look into a few potential causes and solutions based on the available data sources.

Potential Causes and Solutions

  1. LDAP Query Configuration:

    • Ensure that the LDAP query configuration is correctly set up to include both users and groups. A misconfigured LDAP query filter can lead to only users being picked up and not groups, or vice versa. For example, removing a filter clause like (uid=*) might help in reading both users and groups (source: https://datahubspace.slack.com/archives/C029A3M079U/p1716820503.276539).
  2. Entity Hydration Issues:

  3. Just-in-Time (JIT) Provisioning:

Steps to Debug

  1. Check LDAP Configuration:

    • Ensure that your LDAP configuration is correctly set up to include all necessary user and group information.
  2. Review Logs:

    • Check the logs for any errors related to entity hydration or user creation. Look for messages similar to “Could not find entity for urn” or “No matches for urn”.
  3. Reindex Elasticsearch:
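
The log review in step 2 above could be sketched as a small filter over the GMS or frontend log output. This is only an illustration: the two message fragments are the ones quoted above, while the function name and the log file path in the usage note are hypothetical.

```python
import re
from typing import Iterable, List

# Message fragments mentioned in the thread; matched case-insensitively.
HYDRATION_PATTERNS = [
    re.compile(r"Could not find entity for urn", re.IGNORECASE),
    re.compile(r"No matches for urn", re.IGNORECASE),
]


def find_hydration_errors(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain any of the known message fragments."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in HYDRATION_PATTERNS)
    ]
```

For example, with a log saved to a file (path assumed), `find_hydration_errors(open("gms.log", encoding="utf-8"))` returns only the matching lines, which makes it easier to see which URNs are affected.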

Example Configuration

Here is an example of how you might configure the LDAP ingestion to ensure that both users and groups are included:

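# pipeline_name is needed when stateful ingestion is enabled (name is a placeholder)
pipeline_name: "ldap_ingestion"

source:
  type: "ldap"
  config:
    # Connection and credentials (placeholders; replace with your own values)
    ldap_server: "ldap://localhost"
    ldap_user: "cn=admin,dc=example,dc=com"
    ldap_password: "${LDAP_PASSWORD}"
    base_dn: "dc=example,dc=com"
    # Match every object class so that both users and groups are read
    filter: "(objectClass=*)"
    page_size: 500
    stateful_ingestion:
      enabled: true

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"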

Would you like more detailed steps or further assistance on any specific part of this process?

hey Pablo! hmm this is interesting… and you say that in previous versions this was not a problem for you all? and the user can still login properly, it’s just that it appears like they don’t exist in the database?

Exactly! Everyone can log in properly, but new users are not created

Hi Chris! Exactly, after login the record for this user is not created in the database (but users who sign up via an invitation link are created). I don’t see any errors in the logs for the gms or frontend containers. To clarify the installation, we are deploying Docker images on-prem using version 0.13.2.

Also, I did a quick check using the Docker quickstart with version 0.13.3rc1, and even though it was not connected to LDAP, adding the user manually to user.props reproduced the same behaviour: I could log in, but then the profile page threw a 404 complaining that the urn:li:corpuser didn’t exist. Thank you! <@U03BEML16LB>