Investigating Disk Space Issue with MetadataChangeLog_Versioned_v1 in DataHub Environment

user-1 · March 4, 2024, 4:37pm

Hi all, I am looking for advice on how best to investigate an issue we have with DataHub in multiple environments. We are using DataHub 0.10.5 deployed on k8s. In some of our environments, Kafka started to run out of disk space. Disk usage had been stable, but then it increased suddenly, as you can see from the graph (the graph shows the available disk space, not disk usage).

Most of the disk space seems to be used by one particular topic, MetadataChangeLog_Versioned_v1. The topic only has a retention of 7 days, as per the Kafka default.
I am fairly new to DataHub, but I couldn’t find anything that explains the sudden increase of messages in the topic.
Does anyone have an idea what might cause this, or where I should look? TIA
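
For anyone wanting to confirm which topic is actually eating the disk: Kafka ships a `kafka-log-dirs.sh --describe` tool that prints per-partition sizes as JSON. Below is a minimal sketch for totalling that output per topic. It assumes the JSON layout used by recent Kafka versions (`brokers` → `logDirs` → `partitions`, each partition named `<topic>-<number>` with a `size` in bytes). The function name and the sample payload are hypothetical, for illustration only.

```python
import json
from collections import defaultdict


def topic_sizes(log_dirs_json: str) -> dict:
    """Sum on-disk bytes per topic across all brokers and log dirs.

    Replicas are counted once per broker that hosts them, so the total
    reflects disk actually consumed in the cluster.
    """
    totals = defaultdict(int)
    report = json.loads(log_dirs_json)
    for broker in report.get("brokers", []):
        for log_dir in broker.get("logDirs", []):
            for part in log_dir.get("partitions", []):
                # Partition names look like "<topic>-<partition number>";
                # split on the LAST hyphen because topics may contain hyphens.
                topic, _, _number = part["partition"].rpartition("-")
                totals[topic] += part["size"]
    return dict(totals)


# Hypothetical sample in the assumed layout (sizes are made up).
SAMPLE = json.dumps({
    "version": 1,
    "brokers": [
        {"broker": 0, "logDirs": [{"logDir": "/data/kafka", "error": None, "partitions": [
            {"partition": "MetadataChangeLog_Versioned_v1-0", "size": 4000},
            {"partition": "some-other-topic-3", "size": 100},
        ]}]},
        {"broker": 1, "logDirs": [{"logDir": "/data/kafka", "error": None, "partitions": [
            {"partition": "MetadataChangeLog_Versioned_v1-0", "size": 4000},
        ]}]},
    ],
})

if __name__ == "__main__":
    # Print topics from largest to smallest.
    for topic, size in sorted(topic_sizes(SAMPLE).items(), key=lambda kv: -kv[1]):
        print(f"{topic}: {size} bytes")
```

In practice you would feed it the JSON line from the tool's output instead of `SAMPLE`; the tool also prints a couple of non-JSON status lines that need to be skipped first.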

user-2 · March 4, 2024, 4:37pm

I am also facing the same issue…
I am using DataHub 0.12.1 and Helm chart 0.13.19

user-1 · March 4, 2024, 4:37pm

Unfortunately, we got no help on this issue. The only thing we did to mitigate it was to set a retention size on the topic to avoid using all the disk space.
But we don’t know what triggered this increase…
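
For reference, a size cap like this can be applied with Kafka's `kafka-configs.sh` tool by setting the topic-level `retention.bytes` config. The sketch below only builds the command line so it can be reviewed before running. The bootstrap address `kafka:9092` and the 5 GiB limit are assumptions, not the values used in this thread, and the helper name is hypothetical. Note that `retention.bytes` applies per partition, not to the topic as a whole.

```python
import shlex


def retention_command(bootstrap_server: str, topic: str, retention_bytes: int) -> list:
    """Build a kafka-configs.sh invocation that caps a topic's size.

    retention.bytes is enforced per partition, so the topic can still use
    roughly retention_bytes * partitions * replication factor on disk.
    """
    if retention_bytes <= 0:
        raise ValueError("retention_bytes must be a positive number of bytes")
    return [
        "kafka-configs.sh",
        "--bootstrap-server", bootstrap_server,
        "--entity-type", "topics",
        "--entity-name", topic,
        "--alter",
        "--add-config", f"retention.bytes={retention_bytes}",
    ]


if __name__ == "__main__":
    # Assumed values: adjust the broker address and limit to your cluster.
    cmd = retention_command(
        "kafka:9092",
        "MetadataChangeLog_Versioned_v1",
        5 * 1024**3,  # 5 GiB per partition
    )
    # Print a copy-pasteable shell command rather than executing it.
    print(shlex.join(cmd))
```

This only limits the symptom: the time-based retention still applies alongside it, and whichever limit is hit first triggers deletion of old log segments.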

user-2 · March 4, 2024, 4:37pm

In my case I’ve noticed that sometimes a scheduled ingestion run does not end. It keeps sending the bulk request in a loop until I cancel the ingestion… This happened this weekend, and it seems DataHub produced more than 28k messages in MetadataChangeLog_Versioned_v1
