Resolving OIDC Authentication Issues in DataHub

Original Slack Thread

<@U06TM7M40RK>
I want to set up OIDC authentication in DataHub, but after configuring the environment variables below

Required Configuration Values:

AUTH_OIDC_ENABLED=true
AUTH_OIDC_CLIENT_ID=your-client-id
AUTH_OIDC_CLIENT_SECRET=your-client-secret
AUTH_OIDC_DISCOVERY_URI=your-provider-discovery-url
AUTH_OIDC_BASE_URL=your-datahub-url
I am getting the following error:

ERROR c.AuthenticationController - Caught exception while attempting to redirect to SSO identity provider! It’s likely that SSO integration is mis-configured
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
WARN a.s.o.custom.CustomOidcAuthenticator - Retry attempt 1 of 3 failed
org.pac4j.core.exception.TechnicalException: Error getting URL resource
How can I resolve this issue?

The error message you are seeing (sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target) indicates that the DataHub frontend cannot validate the TLS certificate presented by your OIDC provider (e.g., Keycloak, Okta, Azure AD). This commonly happens when the provider uses a self-signed certificate or a certificate signed by an internal Certificate Authority (CA) that the Java runtime in your DataHub frontend container does not trust.
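Before changing the DataHub deployment, it can help to confirm that the provider's certificate really is the problem. The following is a minimal diagnostic sketch, not part of DataHub: it attempts a verified TLS handshake from Python. The function names and the example URL are hypothetical, and the check uses the trust store of the machine running the script (or the PEM bundle you pass in), not the Java truststore inside the frontend container.

```python
import socket
import ssl
from typing import Optional, Tuple
from urllib.parse import urlsplit


def host_port_from_url(url: str) -> Tuple[str, int]:
    """Extract (host, port) from an https URL, defaulting the port to 443."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"expected an https URL, got: {url!r}")
    return parts.hostname, parts.port or 443


def check_certificate(url: str, cafile: Optional[str] = None, timeout: float = 5.0) -> str:
    """Attempt a verified TLS handshake and describe the outcome.

    cafile=None uses the default trust store of the machine running the script.
    Passing a PEM bundle makes the check trust only the CAs in that bundle.
    """
    host, port = host_port_from_url(url)
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "trusted"
    except ssl.SSLCertVerificationError as exc:
        # Raised when the chain cannot be validated, the analogue of the PKIX error.
        return f"untrusted: {exc.verify_message}"
    except OSError as exc:
        return f"connection failed: {exc}"


# Example call (hypothetical URL; use your AUTH_OIDC_DISCOVERY_URI value):
# print(check_certificate("https://idp.example.internal/.well-known/openid-configuration"))
# print(check_certificate("https://idp.example.internal/.well-known/openid-configuration",
#                         cafile="internal-ca.pem"))
```

If the first call reports "untrusted" and the second, with your internal CA bundle, reports "trusted", that CA is the one to add to the frontend's truststore.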

How to resolve this issue:

1. Add the OIDC Provider’s CA Certificate to the Truststore

You need to ensure that the Java truststore used by the DataHub frontend contains the CA certificate that signed your OIDC provider’s certificate.

  • For Kubernetes/Helm deployments:
    The recommended approach is to mount a ConfigMap or Secret containing your trusted CA certificates into the DataHub frontend pod, and then configure the Java truststore to use it.
    According to a related Slack thread, you can:

    1. Build a custom image with the CA certificate added to the Java truststore, or
    2. Mount a ConfigMap/Secret with the CA certs and point the frontend to use it.
  • Set the following environment variables in your deployment:

    - name: SSL_TRUSTSTORE_FILE
      value: /path/to/your/truststore.jks
    - name: SSL_TRUSTSTORE_TYPE
      value: jks
    - name: SSL_TRUSTSTORE_PASSWORD
      value: yourpassword
    

    (See <https://github.com/datahub-project/datahub/issues/7056|this GitHub issue> for an example.)

  • For Docker Compose:
    You can similarly mount the truststore file and set the above environment variables in your docker-compose.yml.
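Whichever deployment option you choose, the truststore file has to exist before it can be mounted. As a rough sketch, the helper below builds and runs a keytool -importcert command that adds a CA certificate to a JKS truststore. The function names, the alias, and the file paths are assumptions for illustration, and keytool must be on the PATH.

```python
import subprocess
from typing import List


def build_import_command(ca_pem: str, truststore: str, password: str,
                         alias: str = "oidc-provider-ca") -> List[str]:
    """Build the keytool argument list that imports a CA certificate.

    keytool creates the truststore file if it does not exist yet.
    """
    return [
        "keytool", "-importcert",
        "-noprompt",            # do not ask for interactive confirmation
        "-trustcacerts",
        "-alias", alias,        # hypothetical alias; any unique name works
        "-file", ca_pem,        # PEM or DER file with the CA certificate
        "-keystore", truststore,
        "-storetype", "JKS",
        "-storepass", password,
    ]


def import_ca(ca_pem: str, truststore: str, password: str) -> None:
    """Run keytool; raises CalledProcessError if the import fails."""
    subprocess.run(build_import_command(ca_pem, truststore, password), check=True)


# Example call (hypothetical paths and password):
# import_ca("internal-ca.pem", "truststore.jks", "yourpassword")
```

The resulting file is what SSL_TRUSTSTORE_FILE should point to inside the container, and the password must match SSL_TRUSTSTORE_PASSWORD. Note that a password passed on the command line is visible in the process list while keytool runs.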

2. Verify the Truststore

  • Make sure the truststore actually contains the CA certificate for your OIDC provider. You can check this with:
    keytool -list -keystore /path/to/your/truststore.jks
    
  • The default password of the JDK's bundled truststore (cacerts) is changeit. A truststore you created yourself uses the password you chose when creating it.
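If you want to script this check, for example in a CI step, the sketch below runs the same keytool -list command and extracts the aliases of trusted-certificate entries. The function names are hypothetical. The parsing assumes that each entry line has the form "<alias>, <date>, trustedCertEntry," which may differ between JDK versions and locales, so treat it as a starting point.

```python
import subprocess
from typing import List


def trusted_aliases(listing: str) -> List[str]:
    """Extract aliases of trusted-certificate entries from `keytool -list` text.

    Assumption: each entry line looks like "<alias>, <date>, trustedCertEntry,"
    and the alias itself contains no comma.
    """
    aliases = []
    for line in listing.splitlines():
        if "trustedCertEntry" in line and "," in line:
            aliases.append(line.split(",", 1)[0].strip())
    return aliases


def list_truststore(path: str, password: str) -> List[str]:
    """Run keytool against the truststore and return its trusted aliases."""
    result = subprocess.run(
        ["keytool", "-list", "-keystore", path, "-storepass", password],
        check=True, capture_output=True, text=True,
    )
    return trusted_aliases(result.stdout)


# Example call (hypothetical path and password):
# print(list_truststore("/path/to/your/truststore.jks", "yourpassword"))
```

An empty result means the truststore holds no trusted CA entries, in which case the PKIX error would persist even with the environment variables set.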

3. Restart the DataHub Frontend

After updating the truststore and environment variables, restart your DataHub frontend service to apply the changes.

4. Additional Notes

  • Disabling SSL verification is not recommended for production environments due to security risks.
  • If you are using a public CA, ensure your Java runtime is up to date and includes the latest root certificates.

Would you like a step-by-step guide for creating and mounting a custom truststore in your specific deployment environment (Kubernetes, Docker Compose, etc.)?


<@U06TM7M40RK>
After adding
SSL_TRUSTSTORE_FILE=path/to/truststore.jks (e.g. /certificates)
SSL_TRUSTSTORE_TYPE=jks
SSL_TRUSTSTORE_PASSWORD=MyTruststorePassword

I am getting the error below:
2025-05-02 07:17:57,258 [main] INFO play.api.Play - Application started (Prod) (no global state)
2025-05-02 07:17:57,375 [main] WARN play.api.Configuration - datahub-frontend/conf/application.conf: 45: akka.max-header-value-length is deprecated, use max-header-size instead
2025-05-02 07:17:57,499 [main] INFO server.CustomAkkaHttpServer - Setting max header count to: 64
2025-05-02 07:17:57,542 [kafka-producer-network-thread | datahub-frontend] INFO org.apache.kafka.clients.Metadata - [Producer clientId=datahub-frontend] Cluster ID: c0sJ1kGkTJWcYXN6B6wUIg
2025-05-02 07:17:57,809 [main] INFO play.core.server.AkkaHttpServer - Listening for HTTP on /[0:0:0:0:0:0:0:0]:9002
2025-05-02 07:18:52,418 [proxyClient-akka.actor.default-dispatcher-5] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2025-05-02 07:20:45,699 [application-akka.actor.default-dispatcher-9] INFO o.p.core.adapter.FrameworkAdapter - Using Play framework adapter
2025-05-02 07:20:45,830 [application-akka.actor.default-dispatcher-7] ERROR controllers.SsoCallbackController - Caught exception while attempting to handle SSO callback! It’s likely that SSO integration is mis-configured.
java.util.concurrent.CompletionException: org.pac4j.oidc.exceptions.OidcMissingSessionStateException: State cannot be determined
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1770)
at play.core.j.HttpExecutionContext.$anonfun$execute$1(HttpExecutionContext.scala:64)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:49)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:48)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
Caused by: org.pac4j.oidc.exceptions.OidcMissingSessionStateException: State cannot be determined
at org.pac4j.oidc.credentials.extractor.OidcCredentialsExtractor.lambda$extract$0(OidcCredentialsExtractor.java:140)
at java.base/java.util.Optional.orElseThrow(Optional.java:403)
at org.pac4j.oidc.credentials.extractor.OidcCredentialsExtractor.extract(OidcCredentialsExtractor.java:140)
at org.pac4j.core.client.BaseClient.getCredentials(BaseClient.java:78)
at auth.sso.oidc.OidcCallbackLogic.superPerform(OidcCallbackLogic.java:171)
at auth.sso.oidc.OidcCallbackLogic.perform(OidcCallbackLogic.java:128)
at controllers.SsoCallbackController$SsoCallbackLogic.perform(SsoCallbackController.java:138)
at controllers.SsoCallbackController.lambda$callback$0(SsoCallbackController.java:74)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
… 8 common frames omitted
2025-05-02 07:20:54,380 [application-akka.actor.default-dispatcher-19] INFO c.AuthenticationController - Login successful for user: datahub, urn: urn:li:corpuser:datahub
2025-05-02 07:20:54,380 [application-akka.actor.default-dispatcher-19] INFO client.AuthServiceClient - Requesting session token for user: datahub
2025-05-02 07:20:54,401 [application-akka.actor.default-dispatcher-19] INFO client.AuthServiceClient - Successfully received session token for user: datahub
2025-05-02 07:20:55,775 [application-akka.actor.default-dispatcher-19] ERROR controllers.SsoCallbackController - Caught exception while attempting to handle SSO callback! It’s likely that SSO integration is mis-configured.
java.util.concurrent.CompletionException: org.pac4j.oidc.exceptions.OidcMissingSessionStateException: State cannot be determined
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1770)
at play.core.j.HttpExecutionContext.$anonfun$execute$1(HttpExecutionContext.scala:64)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:49)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:48)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
Caused by: org.pac4j.oidc.exceptions.OidcMissingSessionStateException: State cannot be determined
at org.pac4j.oidc.credentials.extractor.OidcCredentialsExtractor.lambda$extract$0(OidcCredentialsExtractor.java:140)
at java.base/java.util.Optional.orElseThrow(Optional.java:403)
at org.pac4j.oidc.credentials.extractor.OidcCredentialsExtractor.extract(OidcCredentialsExtractor.java:140)
at org.pac4j.core.client.BaseClient.getCredentials(BaseClient.java:78)
at auth.sso.oidc.OidcCallbackLogic.superPerform(OidcCallbackLogic.java:171)
at auth.sso.oidc.OidcCallbackLogic.perform(OidcCallbackLogic.java:128)
at controllers.SsoCallbackController$SsoCallbackLogic.perform(SsoCallbackController.java:138)
at controllers.SsoCallbackController.lambda$callback$0(SsoCallbackController.java:74)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)

The error OidcMissingSessionStateException: State cannot be determined after setting up your truststore means that the DataHub frontend is now able to reach your OIDC provider (the SSL issue is resolved), but the OIDC authentication flow is failing because the “state” parameter is missing or not being persisted between the authentication request and the callback.

This is a common OIDC integration issue and is not related to the truststore or SSL configuration, but rather to session management in your DataHub frontend.


Why does this happen?

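The “state” parameter is used by OIDC to prevent CSRF attacks and to tie the callback to the login request that started it. The frontend stores the value when it redirects you to the identity provider and looks it up again when the callback arrives. If the state is missing or cannot be determined, it usually means: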

  • The session store is not persisting session data correctly.
  • Cookies are not being set or sent properly (e.g., due to misconfiguration, domain mismatch, or secure flag issues).
  • There is a proxy or load balancer interfering with cookies or session stickiness.
  • The frontend is running in multiple pods/containers without a shared session store.
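To make the failure mode concrete, here is a conceptual sketch of the state round trip. It is not pac4j's or DataHub's implementation: the function names and the dictionary standing in for the session are assumptions. The point it illustrates is that the callback can only succeed if it sees the same session that the login request wrote to.

```python
import hmac
import secrets
from typing import Dict, Optional


def begin_login(session: Dict[str, str]) -> str:
    """Create a random state, remember it in the session, and return it
    so it can be sent to the identity provider with the login request."""
    state = secrets.token_urlsafe(32)
    session["oidc_state"] = state
    return state


def handle_callback(session: Dict[str, str], returned_state: Optional[str]) -> bool:
    """Check the state returned by the identity provider against the session."""
    expected = session.pop("oidc_state", None)
    if expected is None:
        # The session is empty: the cookie was lost, or a different session was used.
        raise LookupError("State cannot be determined")
    if returned_state is None:
        return False
    # Constant-time comparison of the stored and returned values.
    return hmac.compare_digest(expected.encode(), returned_state.encode())


# Same session on both legs: the callback succeeds.
session = {}
state = begin_login(session)
print(handle_callback(session, state))

# Callback arrives with a fresh, empty session (lost cookie, other pod): it fails.
try:
    handle_callback({}, state)
except LookupError as exc:
    print(exc)
```

Each cause in the list above is a different way of ending up in the second case: the value was written, but the callback cannot read it back.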

How to resolve

1. Check Session Store Configuration

DataHub uses the Play Framework’s session store. If you are running multiple frontend instances (pods/containers), you must use a shared session store (e.g., Redis, Memcached). If you use the default in-memory store, sessions will not persist across pods.

  • In your application.conf (or via environment variables), ensure you are using a shared session store.
  • For simple setups, the default play.cache store is used, but for production, you should use Redis or another distributed cache.

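Example application.conf (illustrative; verify the key names against the application.conf shipped with your DataHub version before relying on them):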

play {
  cache {
    defaultCache = "play"
    bindCaches = ["session-cache"]
  }
}

session {
  store = "play.cache"
  cookieName = "PLAY_SESSION"
  secure = true
  httpOnly = true
}

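You can also set these as environment variables in Docker Compose or Kubernetes, provided your DataHub version maps them to the settings above (verify the variable names before relying on them):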

- PLAY_SESSION_STORE=play.cache
- PLAY_SESSION_COOKIE_NAME=PLAY_SESSION
- PLAY_SESSION_SECURE=true
- PLAY_SESSION_HTTP_ONLY=true

(<https://github.com/datahub-project/datahub/blob/master/docs/authentication/guides/sso/configure-oidc-react.md|Reference>)


2. Check Cookie Settings

  • Make sure your DataHub frontend is served over HTTPS when the secure flag is set on cookies; browsers will not send Secure cookies over plain HTTP.
  • Ensure the domain and path for cookies are correct and match your DataHub base URL (AUTH_OIDC_BASE_URL).
  • If you are behind a proxy or load balancer, make sure it is not stripping or modifying cookies.
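These checks can be applied to a Set-Cookie header copied from the browser's developer tools. The sketch below is a hypothetical helper, not part of DataHub: it parses the header with Python's standard library and reports reasons the browser might not send the cookie back on the SSO callback. The cookie value and the example URLs are placeholders.

```python
from http.cookies import SimpleCookie
from typing import List
from urllib.parse import urlsplit


def cookie_problems(set_cookie_header: str, base_url: str) -> List[str]:
    """Return reasons a browser might not send this cookie back to base_url."""
    url = urlsplit(base_url)
    host = (url.hostname or "").lower()
    problems = []
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    for name, morsel in cookie.items():
        # Secure cookies are only sent over HTTPS.
        if morsel["secure"] and url.scheme != "https":
            problems.append(f"{name}: Secure cookie but base URL is not https")
        # The Domain attribute must equal the host or be a parent domain of it.
        domain = morsel["domain"].lstrip(".").lower()
        if domain and host != domain and not host.endswith("." + domain):
            problems.append(f"{name}: Domain={domain} does not match host {host}")
        # Strict cookies are withheld on the cross-site redirect from the provider.
        if morsel["samesite"].lower() == "strict":
            problems.append(f"{name}: SameSite=Strict is not sent on cross-site redirects")
    return problems


# Example with placeholder values:
print(cookie_problems("PLAY_SESSION=abc; Path=/; Secure; HttpOnly",
                      "http://datahub.example.com"))
```

An empty list does not prove the cookie arrives, since a proxy can still strip it, but a non-empty list points at a concrete setting to fix.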

3. Single Pod/Container Test

If you are running multiple frontend pods, try scaling down to a single pod/container to see if the issue persists. If it works with one pod, but not with multiple, you need a shared session store.


4. Clear Browser Cookies

Sometimes, old or invalid cookies can cause this issue. Clear your browser cookies for your DataHub domain and try again.


5. Enable Verbose Logging

Set AUTH_VERBOSE_LOGGING=true in your environment to get more detailed logs for debugging.


Would you like step-by-step instructions for configuring a distributed session store (e.g., Redis) for DataHub, or more help with debugging your session/cookie setup?

