Custom Implementation and Docker Image Deployment Challenges in Upgrading DataHub

Original Slack Thread

Hi, guys! Hope you’re having a great day!

I recently moved to the Data Engineering team and was responsible for upgrading DataHub from 0.8.43 to 0.10.4 (yes, it’s been a while since we upgraded it :exploding_head:). I ran into a lot of hardship along the way, but managed to deliver it. But there’s an issue on our side with how deployment is done today: the images come already built from the Docker repository (the front end gets a .zip from somewhere, datahub-upgrade gets its jar already built in the Dockerfile, etc.).

Because we have a custom implementation (we added a new “metric” entity similar to dataset), we can’t use these ready-made images because they don’t always understand this customisation. Can anyone point me to some reading material or code bases that we can use as a reference?

Hello!
My case is similar to yours, so I am sharing some links in the hope they will spare you a bit of the time it took me ^^.

For the local build commands for the GMS and front-end binaries/zip, we used this documentation: https://datahubproject.io/docs/developers/

And for building the Docker images from the Dockerfiles, we looked at the DataHub GitHub Actions workflows that handle it, for example: https://github.com/datahub-project/datahub/actions/runs/6296887808/workflow

Hope it helps ^^.
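
To make the two steps from the reply concrete (build the artifacts locally with Gradle, then build the images from the repo's Dockerfiles), here is a rough sketch of a helper that prints the commands for a fork containing a custom entity. It rests on several assumptions and is not the project's official build procedure. The GMS and front-end Gradle tasks are the ones listed in the developer docs linked above, while the `:datahub-upgrade:build` task, the `docker/<component>/Dockerfile` layout, the use of the repo root as build context, the registry name `registry.example.com/data-eng`, and the tag `0.10.4-metric` are all assumed or hypothetical. How each Dockerfile picks up the built artifacts differs between releases, so check the Dockerfiles and the linked workflow for your version before relying on this.

```python
import shlex
import subprocess

# Gradle tasks that produce the artifact each image needs.
# GMS and frontend tasks come from the developer docs; the
# datahub-upgrade task name is an assumption.
GRADLE_TASKS = {
    "datahub-gms": [":metadata-service:war:build"],
    "datahub-frontend": [":datahub-frontend:dist", "-x", "yarnTest", "-x", "yarnLint"],
    "datahub-upgrade": [":datahub-upgrade:build"],
}


def build_commands(component, tag, registry="registry.example.com/data-eng"):
    """Return the argv lists needed to build one component from a fork checkout.

    `registry` is a hypothetical private registry; replace it with your own.
    """
    if component not in GRADLE_TASKS:
        raise ValueError(f"unknown component: {component}")
    image = f"{registry}/{component}:{tag}"
    return [
        ["./gradlew", *GRADLE_TASKS[component]],
        # Assumed layout: docker/<component>/Dockerfile, repo root as build context.
        ["docker", "build", "-f", f"docker/{component}/Dockerfile", "-t", image, "."],
    ]


def run_all(tag, repo_dir=".", dry_run=True):
    """Print every command; execute them only when dry_run is False."""
    for component in GRADLE_TASKS:
        for cmd in build_commands(component, tag):
            print(shlex.join(cmd))
            if not dry_run:
                subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Dry run by default, so nothing is built until the output has been reviewed.
    run_all("0.10.4-metric")
```

Running it as-is only prints the commands. Passing `dry_run=False` executes them from `repo_dir`, which should be the root of the fork that contains the custom “metric” entity.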