Troubleshooting intermittent ingest failures with NullPointerException error in Datahub 10.5

user-1 · March 4, 2024, 4:30pm

Hi. I have a couple of ingests that fail with this error

```
'info': {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException',
         'message': 'java.lang.NullPointerException',
         'status': 500,
```
But what's strange is that I have several separate ingest jobs for separate schemas within the same Database, and some of them worked fine, some failed with this error.
Please advise on what to do.
Datahub version 10.5. Full log attached: ![attachment](https://files.slack.com/files-pri/TUMKD5EGJ-F062PKFFE10/ingest_error.log?t=xoxe-973659184562-6705490291811-6708051934148-dd1595bd5f63266bc09e6166373c7a3c)

datahub_team · March 4, 2024, 4:30pm

Hey Nadia! Sorry for the delayed response here… were you able to get this resolved?

user-1 · March 4, 2024, 4:30pm

<@U0121TRV0FL> sometimes it still shows up, but usually it goes away if I rerun the ingest job once or twice manually. so it kinda comes and goes :woman-shrugging:

datahub_team · March 4, 2024, 4:30pm

Hmm… very strange! Please let us know if you find a failure pattern!

user-1 · March 4, 2024, 4:30pm

Hi <@U0121TRV0FL> this error still shows up sometimes, and I noticed that it does cause issues for us since it produces some broken entities. Could you please point me to whoever can help me with this?
I’m attaching full GMS logs as well as ingest job logs.
It happens occasionally for some of the daily ingest jobs: they fail with this error maybe once in a week or two, sometimes twice in a row, sometimes only once, and then work successfully again until the next failure. I haven’t been able to find a pattern. The jobs are scheduled and always run at the same time.

We’re on Datahub 10.5 and are deployed via Kubernetes. [attachment]

user-1 · March 4, 2024, 4:30pm

Hi <@U0121TRV0FL> can you please pass this issue on to someone from the tech side of the team? It is still happening, even with CLI set to 12.1

user-2 · March 4, 2024, 4:30pm

hey Nadia! so i see in your logs that this NPE is being raised by a section of code that should be updated now. you say that you still get this error when updating datahub to 12.1?

user-2 · March 4, 2024, 4:30pm

if so could you send the error logs from there as well? <@U03LYB2ESJ0>

user-1 · March 4, 2024, 4:30pm

<@U03BEML16LB> here you go, the fresh ones from today, CLI 12.1.1 [attachment]

user-2 · March 4, 2024, 4:30pm

<@UV5UEC3LN> can you make heads or tails of where or why this NPE could be happening? in 10.5 it looks like `EbeanAspectDao.runInTransactionWithRetry(EbeanAspectDao.java:531)` is the culprit, where the exact location is this line: `if (sqlState.equals("40001")) {` (https://github.com/datahub-project/datahub/blob/v0.10.5/metadata-io/src/main/java/com/linkedin/metadata/entity/ebean/EbeanAspectDao.java#L531C13-L531C44). however in 12.1 we still get an NPE, but i’m not seeing a specific line being called out. it might be in a similar place (after calling `_entityService.ingestProposal`)
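
For illustration only, here is a minimal sketch of how a comparison shaped like `sqlState.equals("40001")` can raise an NPE. It assumes the SQLSTATE value can be null at that point, which this thread has not confirmed. The class and helper names are hypothetical and are not DataHub code; only `java.sql.SQLException` and its `getSQLState()` method are real JDK APIs.

```java
import java.sql.SQLException;

public class NullSafeSqlState {
    // SQLSTATE 40001 is the standard code for a serialization failure.
    static final String SERIALIZATION_FAILURE = "40001";

    // Hypothetical helper (not part of DataHub): putting the constant first
    // means a null SQLSTATE yields false instead of a NullPointerException.
    static boolean isSerializationFailure(SQLException e) {
        return SERIALIZATION_FAILURE.equals(e.getSQLState());
    }

    public static void main(String[] args) {
        // An exception that carries a SQLSTATE.
        SQLException withState = new SQLException("could not serialize access", "40001");
        // An exception created without a SQLSTATE: getSQLState() returns null.
        SQLException withoutState = new SQLException("no SQLSTATE provided");

        System.out.println(isSerializationFailure(withState));
        System.out.println(isSerializationFailure(withoutState));

        try {
            // Variable-first comparison, the shape quoted from line 531.
            withoutState.getSQLState().equals("40001");
        } catch (NullPointerException npe) {
            System.out.println("variable-first comparison threw an NPE");
        }
    }
}
```

If the SQLSTATE really is null only for some database errors, that would be consistent with the failure coming and going, but that is a guess until the GMS stack trace confirms it.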

user-1 · March 4, 2024, 4:30pm

meanwhile I’ll try to set CLI to 0.12.1.4 and see if it helps

user-3 · March 4, 2024, 4:31pm

Do you have the GMS logs from the more recent run?

user-3 · March 4, 2024, 4:31pm

Also is 0.10.5 your client version or server version? If Client, what is your server version on the run?

user-3 · March 4, 2024, 4:31pm

Note: if it is the line that Chris called out, that line is not present in server version 0.12.1, so at minimum it would be a different error happening if you have modified both your server and client versions.
