Troubleshooting dataPlatform load failure on DataHub GMS

Original Slack Thread

Hello, I am running DataHub v0.11 on MySQL.
I accidentally sent lineage data with the ‘wrong urn’ through Airflow today.
The urn was supposed to be of the form ‘urn:li:dataset:(urn:li:dataPlatform:hive,table_name,PROD)’, but I ended up capitalizing ‘hive’, meaning it was of the form ‘urn:li:dataset:(urn:li:dataPlatform:HIVE,table_name,PROD)’.
Since then, the DataHub main page no longer loads the dataPlatform properly.
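
A guard in the Airflow job could catch this kind of casing slip before the URN is emitted. The following is only a minimal sketch: `normalize_dataset_urn` is a hypothetical helper, not part of the DataHub SDK, and it assumes that platform names in this deployment are always lowercase.

```python
import re

# Matches urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)
DATASET_URN = re.compile(
    r"^urn:li:dataset:\(urn:li:dataPlatform:(?P<platform>[^,]+),"
    r"(?P<name>.+),(?P<env>[^,)]+)\)$"
)


def normalize_dataset_urn(urn: str) -> str:
    """Return the dataset URN with its platform name lowercased.

    Hypothetical helper: assumes lowercase platform names are the convention.
    The dataset name and environment are left untouched.
    """
    match = DATASET_URN.match(urn)
    if match is None:
        raise ValueError(f"Not a dataset URN: {urn!r}")
    platform = match.group("platform").lower()
    name = match.group("name")
    env = match.group("env")
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


fixed = normalize_dataset_urn(
    "urn:li:dataset:(urn:li:dataPlatform:HIVE,table_name,PROD)"
)
print(fixed)
```

Running the lineage URNs through a check like this before sending them would turn the capitalized 'HIVE' form back into the intended 'hive' form.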

This is the DataHub GMS error log from when I open the main page.

```
	at com.linkedin.datahub.graphql.types.dataplatform.DataPlatformType.batchLoad(DataPlatformType.java:59)
	at com.linkedin.datahub.graphql.GmsGraphQLEngine.lambda$createDataLoader$213(GmsGraphQLEngine.java:1813)
	... 17 common frames omitted
Caused by: java.lang.IllegalStateException: Duplicate key EntityAspectIdentifier(urn=urn:li:dataPlatform:hive, aspect=dataPlatformInfo, version=0) (attempted merging values com.linkedin.metadata.entity.EntityAspect@b79d8e5d and com.linkedin.metadata.entity.EntityAspect@b79d8e5d)
	at java.base/java.util.stream.Collectors.duplicateKeyException(Collectors.java:133)
	at java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180)
	at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
	at java.base/java.util.ArrayList$Itr.forEachRemaining(ArrayList.java:1033)
	at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
	at com.linkedin.metadata.entity.ebean.EbeanAspectDao.batchGet(EbeanAspectDao.java:264)
	at com.linkedin.metadata.entity.EntityServiceImpl.getEnvelopedAspects(EntityServiceImpl.java:1748)
	at com.linkedin.metadata.entity.EntityServiceImpl.getCorrespondingAspects(EntityServiceImpl.java:391)
	at com.linkedin.metadata.entity.EntityServiceImpl.getLatestEnvelopedAspects(EntityServiceImpl.java:345)
	at com.linkedin.metadata.entity.EntityServiceImpl.getEntitiesV2(EntityServiceImpl.java:299)
	at com.linkedin.metadata.client.JavaEntityClient.batchGetV2(JavaEntityClient.java:124)
	at com.linkedin.datahub.graphql.types.dataplatform.DataPlatformType.batchLoad(DataPlatformType.java:44)
	... 18 common frames omitted
```
After seeing that error message, I suspected that a dataPlatform named 'HIVE' had been created and that this was causing a conflict. I checked the MySQL table and dataplatformindex_v2 in Elasticsearch, but 'urn:li:dataPlatform:HIVE' did not exist in either.
Is there anything else I can do here? Thank you.
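
One possible reading of the trace, which the thread does not confirm, is that no 'HIVE' row needs to exist for the duplicate to appear. If the urn column is compared case-insensitively, a lookup for 'urn:li:dataPlatform:HIVE' matches the existing 'hive' row, so requesting both spellings yields the same row twice, and a collector that insists on unique keys then fails. The sketch below reproduces that pattern with SQLite's `NOCASE` collation as a stand-in; the table name, the per-URN lookup, and both helper functions are illustrative assumptions, not DataHub's actual schema or query.

```python
import sqlite3


def fetch_rows(conn, urns):
    """Run one equality lookup per requested URN and concatenate the results."""
    rows = []
    for urn in urns:
        cursor = conn.execute(
            "SELECT urn, aspect, version FROM aspects WHERE urn = ?", (urn,)
        )
        rows.extend(cursor.fetchall())
    return rows


def to_unique_map(rows):
    """Build a dict keyed by (urn, aspect, version), rejecting repeated keys."""
    result = {}
    for row in rows:
        if row in result:
            raise ValueError(f"Duplicate key {row}")
        result[row] = row
    return result


conn = sqlite3.connect(":memory:")
# Assumption for illustration: the urn column compares case-insensitively.
conn.execute(
    "CREATE TABLE aspects (urn TEXT COLLATE NOCASE, aspect TEXT, version INTEGER)"
)
conn.execute(
    "INSERT INTO aspects VALUES ('urn:li:dataPlatform:hive', 'dataPlatformInfo', 0)"
)

# Only the lowercase row is stored, yet both spellings match it.
requested = ["urn:li:dataPlatform:hive", "urn:li:dataPlatform:HIVE"]
rows = fetch_rows(conn, requested)

try:
    to_unique_map(rows)
except ValueError as exc:
    print(exc)
```

If this is what happened, the stored dataPlatform rows would look clean, as described above, and the source of the second request would be whatever entity still references the capitalized platform URN.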

Please follow this thread to resolve the issue
https://datahubspace.slack.com/archives/C029A3M079U/p1676616769274879

• <@U0445MUD81W> Problem solved. Thank you!!