Need to reset schemas for a specific topic in DataHub due to serialization errors

Original Slack Thread

Hey guys,
question - we use DataHub 0.11.0, tried to upgrade to 0.13.0, but something went wrong and we rolled back to 0.11.
Now I see lots of errors in the datahub-gms pod like:

```
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 6
Caused by: java.lang.ArrayIndexOutOfBoundsException: null
```
Looks like some mess with the schemas. I tried to find the Avro schema from the current v0.11 tag and publish it, but it didn't help - it processed some messages and then failed again.

Is it possible to remove all schemas for this topic and re-create them somehow? Or I noticed that we can disable the schema registry somehow, maybe via an env var:
SCHEMA_REGISTRY_TYPE: INTERNAL

Or at least ignore the broken messages?


The cause may be a misconfigured schema registry; however, when switching schema registries you also have to clear the topics. The easiest way to reset the topics is to delete the DataHub topics and then perform a helm install/upgrade, which re-runs the kafka-setup job to create them again.

If no customization was done to the topic names, the topics are the following:

DataHubUsageEvent_v1
FailedMetadataChangeEvent_v4
FailedMetadataChangeProposal_v1
MetadataAuditEvent_v4
MetadataChangeEvent_v4
MetadataChangeLog_Timeseriesv1
MetadataChangeLog_Versionedv1
MetadataChangeProposal_v1
PlatformEvent_v1
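
The reset procedure above can be sketched as a short script. This is a minimal sketch, not the official procedure: the broker address (`localhost:9092`), the helm release name (`datahub`), chart reference (`datahub/datahub`), and namespace are all assumptions you should adjust for your cluster, and it assumes a Kafka version new enough (2.2+) for the `--bootstrap-server` flag:

```shell
# Assumed broker address - point this at your Kafka bootstrap server.
BOOTSTRAP="localhost:9092"

# Delete the default DataHub topics listed above.
for topic in \
  DataHubUsageEvent_v1 \
  FailedMetadataChangeEvent_v4 \
  FailedMetadataChangeProposal_v1 \
  MetadataAuditEvent_v4 \
  MetadataChangeEvent_v4 \
  MetadataChangeLog_Timeseriesv1 \
  MetadataChangeLog_Versionedv1 \
  MetadataChangeProposal_v1 \
  PlatformEvent_v1
do
  kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --delete --topic "$topic"
done

# Re-run the helm upgrade so the kafka-setup job recreates the topics.
# Release name, chart, and namespace below are assumptions - use your own.
helm upgrade --install datahub datahub/datahub -n datahub
```

Note that deleting topics is destructive: any unprocessed messages in them are lost, so only do this when you are prepared to re-ingest.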