Understanding and Managing Cron Jobs Deployed by the DatahubUpgrade Component in k8s with Helm

Original Slack Thread

Hi Team - In our installation onto k8s using Helm, I see the following cron jobs getting deployed because the datahubUpgrade component (https://github.com/acryldata/datahub-helm/blob/master/charts/datahub/values.yaml#L222) is enabled. Is this expected?
If yes, may I know what these cron jobs and the datahubUpgrade component are responsible for?

  1. datahub-datahub-restore-indices-job-template
  2. datahub-datahub-cleanup-job-template

<@UV5UEC3LN> any ideas on this one?

Yes, this is expected. These are utility jobs: https://datahubproject.io/docs/how/restore-indices/

They only execute if manually called, so unless you need to run the utilities, they can be ignored.
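For readers wondering what "manually called" could look like in practice, here is a minimal sketch using `kubectl create job --from`, which creates a one-off Job from a CronJob's template. The CronJob name is the one reported in this thread and depends on your Helm release name; the Job name `restore-indices-adhoc` and the helper function name are hypothetical, chosen only for illustration. The sketch assumes the CronJob lives in your current namespace.

```shell
# CronJob name as reported in this thread; yours depends on the Helm release name.
CRONJOB="datahub-datahub-restore-indices-job-template"

# Create a one-off Job from the CronJob's job template.
# The default Job name "restore-indices-adhoc" is a hypothetical placeholder.
run_restore_indices_once() {
  kubectl create job --from="cronjob/${CRONJOB}" "${1:-restore-indices-adhoc}"
}

# Usage (uncomment to run against your cluster):
# run_restore_indices_once
```

The Job created this way runs once and does not change the CronJob's own schedule or suspend setting.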

But I see the schedule (https://github.com/acryldata/datahub-helm/blob/master/charts/datahub/templates/datahub-upgrade/datahub-cleanup-job-template.yml#L13C3-L13C3) to be every minute?

Hi, I am also facing issues related to the datahub-datahub-restore-indices-job-template and datahub-datahub-cleanup-job-template cron jobs. They take up the entire CPU and memory of my GKE cluster, which crashes the DataHub UI. Any idea how to stop this from happening?

I have suspended both cron jobs and deleted all the corresponding running jobs. Is it OK to keep them suspended? What do we really need these jobs for?

Yeah, you can keep them suspended. These are one-off jobs and should not be run regularly.
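As a rough illustration of the suspension discussed above, the sketch below sets `spec.suspend` to `true` on both CronJobs with `kubectl patch` and reads the field back. The CronJob names are the ones quoted in this thread and will differ with another Helm release name; the function names are hypothetical helpers, and the current namespace is assumed.

```shell
# CronJob names as reported in this thread; they depend on the Helm release name.
CRONJOBS="datahub-datahub-restore-indices-job-template datahub-datahub-cleanup-job-template"

# Suspend a single CronJob so the controller stops creating new Jobs from it.
suspend_cronjob() {
  kubectl patch cronjob "$1" -p '{"spec":{"suspend":true}}'
}

# Suspend every CronJob listed in CRONJOBS.
suspend_all() {
  for name in $CRONJOBS; do
    suspend_cronjob "$name"
  done
}

# Print the current value of spec.suspend for a CronJob.
is_suspended() {
  kubectl get cronjob "$1" -o jsonpath='{.spec.suspend}'
}

# Usage (uncomment to run against your cluster):
# suspend_all
# is_suspended datahub-datahub-cleanup-job-template
```

Suspending a CronJob does not stop Jobs that have already started, so any running Jobs still need to be deleted separately, as described in the thread. Note that a later `helm upgrade` may reset the field to whatever the chart renders.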