Impact of Deleting Data from Elasticsearch on DataHub and Process for Data Restoration

Original Slack Thread

About all the metadata that DataHub stores in Elasticsearch: I wanted to know, if we delete that data from Elasticsearch, what will the impact be on DataHub?

ES is the backend for all GraphQL and search queries, so if you delete everything in ES you won't find any assets in the UI.
You would need to restore the indices from the DB. There is a special k8s job that does this.

So if I delete all the data in an ES index, we can get all of it back by running a job (restoreIndices). Am I correct?
What do you mean by “indices from the DB”?

> So if I delete all the data in an ES index, we can get all of it back by running a job (restoreIndices). Am I correct?
Correct, but please try this on a DEV instance first, because there is currently an issue with this job :slightly_smiling_face:
> What do you mean by “indices from the DB”?
I mean restoring the ES indices by loading the data from the SQL DB.
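To make the idea concrete, here is a minimal toy sketch in Python of why a restore is possible at all: the SQL DB is the source of truth and the ES index is derived data that can be rebuilt from it. This is not DataHub's actual restoreIndices code. The in-memory SQLite table, the simplified `metadata_aspect` schema, the "version 0 is the latest" convention, and the `restore_indices` and `search` helpers are all assumptions made up for illustration.

```python
import json
import sqlite3

# Toy stand-ins (assumptions, not DataHub internals):
#   - an in-memory SQLite table plays the SQL DB (source of truth)
#   - a plain dict plays the Elasticsearch index (derived, rebuildable)


def make_sql_db():
    """Create a hypothetical, simplified aspect table with a few rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata_aspect "
        "(urn TEXT, aspect TEXT, version INTEGER, metadata TEXT)"
    )
    rows = [
        # Assumption for this sketch: version 0 marks the latest aspect value.
        ("urn:li:dataset:orders", "datasetProperties", 0,
         json.dumps({"description": "Orders table"})),
        ("urn:li:dataset:orders", "datasetProperties", 1,
         json.dumps({"description": "old text"})),
        ("urn:li:dataset:users", "datasetProperties", 0,
         json.dumps({"description": "Users table"})),
    ]
    conn.executemany("INSERT INTO metadata_aspect VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def restore_indices(conn):
    """Rebuild the toy search index from the latest rows in the SQL DB."""
    index = {}
    cursor = conn.execute(
        "SELECT urn, aspect, metadata FROM metadata_aspect WHERE version = 0"
    )
    for urn, aspect, metadata in cursor:
        index.setdefault(urn, {})[aspect] = json.loads(metadata)
    return index


def search(index, term):
    """Return the URNs whose indexed aspects mention the term."""
    term = term.lower()
    return sorted(
        urn
        for urn, aspects in index.items()
        if any(term in json.dumps(value).lower() for value in aspects.values())
    )


conn = make_sql_db()
index = restore_indices(conn)

index.clear()                     # simulate deleting all data from ES
print(search(index, "orders"))    # nothing is found while the index is empty

index = restore_indices(conn)     # rebuild the index from the SQL DB
print(search(index, "orders"))    # the asset is searchable again
```

In this sketch the delete only touches the derived index, so nothing is lost as long as the SQL rows are intact; the real job does the equivalent against the actual SQL DB and ES indices.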

Thanks <@U02AF5P6QDS>

<@U02AF5P6QDS> We currently have DataHub v0.9.2 deployed but want to upgrade to v0.13.0. Are there changes to the ES queries? And how do we get the data back from the MySQL DB?

I am not sure it is a good plan to upgrade directly from 0.9.x to 0.13.x, as larger changes were made in between (see https://datahubproject.io/docs/how/updating-datahub). Someone else needs to answer this.
Normally every upgrade performs the necessary migration steps in ES, so you do not need to clean up and restore by yourself.

ok

What about getting the data back from the MySQL DB? Any documentation…