Understanding the `datahub-upgrade` Command-Line Tool for Greenfield Deployment

Original Slack Thread

~Hey, hey - I am lacking some clarity on how the datahub-upgrade command-line tool needs to be invoked for a greenfield deployment. In particular, is it necessary to run the SystemUpdate command, which requires a Kafka producer to send an upgrade history event to the metadata service (not sure why?) mce/mae consumers and to backfill the v2 browse paths for datasets, dashboards etc. Or does the BuildIndices command suffice?~

I answered myself by introspecting the code.

Never mind, my findings:
• The metadata service will not start unless it can guarantee that any changes made to the metadata model are successfully applied. It does so by consuming an upgrade history event - the SystemUpdate command is responsible for generating that. Trying to bypass this functionality would only cause me headaches in the future
• Backfilling browse paths is apparently important - this results in a transaction against the database (not diving too deep into this)

Thanks for sharing!