<@U06TM7M40RK> use external postgresql during installation
To use an external PostgreSQL database during DataHub installation, you need to configure your deployment (whether using Helm on Kubernetes or Docker Compose) to point to your external PostgreSQL instance instead of deploying the default internal one.
Helm/Kubernetes Installation
- Disable Internal PostgreSQL:
  In your values.yaml for the prerequisites chart, set the internal PostgreSQL to enabled: false.
- Configure External PostgreSQL Connection:
  In your main DataHub values.yaml, under the global.sql.datasource section, provide the connection details for your external PostgreSQL:

      global:
        sql:
          datasource:
            host: "<EXTERNAL_PG_HOST>:5432"
            hostForpostgresqlClient: "<EXTERNAL_PG_HOST>"
            port: "5432"
            url: "jdbc:postgresql://<EXTERNAL_PG_HOST>:5432/datahub?user=<PG_USER>&password=<PG_PASSWORD>&sslmode=require"
            driver: "org.postgresql.Driver"
            username: "<PG_USER>"
            password:
              value: "<PG_PASSWORD>"

  Replace <EXTERNAL_PG_HOST>, <PG_USER>, and <PG_PASSWORD> with your actual PostgreSQL host, username, and password.
- Apply the Configuration:
  Deploy or upgrade DataHub with your updated values:

      helm upgrade --install datahub datahub/datahub --values values.yaml

  This will configure DataHub to use your external PostgreSQL instance as its storage backend.
- (Optional) PostgreSQL Setup Job:
  If you want DataHub to initialize the schema in your external database, ensure the postgresqlSetupJob is enabled and points to the same connection details. If your database is already initialized, you can disable this job.
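As a rough illustration of the datasource settings described above, the following Python sketch builds the same global.sql.datasource mapping from a host, username, and password. The function name build_datasource_values, the example host pg.example.internal, and the top-level placement of postgresqlSetupJob.enabled are assumptions for illustration, not taken from the chart itself. It also percent-encodes the credentials placed in the JDBC URL, on the assumption that the PostgreSQL JDBC driver decodes them; check this against your driver version before relying on it.

```python
import json
from urllib.parse import quote


def build_datasource_values(host, user, password, database="datahub",
                            port=5432, run_setup_job=True):
    """Return a Helm values mapping for an external PostgreSQL instance.

    Hypothetical helper: key names mirror the values.yaml snippet in this
    thread; verify them against the chart version you deploy.
    """
    # Percent-encode credentials so characters such as '&' or '@' do not
    # break the query string of the JDBC URL (assumes the driver decodes them).
    jdbc_url = (
        f"jdbc:postgresql://{host}:{port}/{database}"
        f"?user={quote(user, safe='')}&password={quote(password, safe='')}"
        "&sslmode=require"
    )
    return {
        "global": {"sql": {"datasource": {
            "host": f"{host}:{port}",
            "hostForpostgresqlClient": host,
            "port": str(port),
            "url": jdbc_url,
            "driver": "org.postgresql.Driver",
            "username": user,
            "password": {"value": password},
        }}},
        # Assumed location of the setup-job toggle mentioned above.
        "postgresqlSetupJob": {"enabled": run_setup_job},
    }


if __name__ == "__main__":
    values = build_datasource_values("pg.example.internal", "datahub", "p@ss&word")
    # JSON is a subset of YAML, so this output can be saved to a file
    # and passed to Helm with --values.
    print(json.dumps(values, indent=2))
```

Saving the printed output to a file and passing it with --values alongside your main values.yaml should override only the keys shown, leaving the rest of the chart configuration untouched.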
References:
- <https://datahubproject.io/docs/0.14.0/deploy/azure#use-postgressql-for-the-storage-layer|DataHub Azure Deployment Docs>
- <Slack Chart Slack Discussion>
- <https://github.com/datahub-project/datahub/blob/master/docs/deploy/azure.md|Sample values.yaml for PostgreSQL>
Docker Compose Installation
If you are using Docker Compose, you can use <https://github.com/datahub-project/datahub/blob/master/docker/postgres/docker-compose.postgres.yml|docker-compose.postgres.yml> as a reference. You will need to:
- Remove the internal PostgreSQL service from your compose file.
- Set the environment variables for GMS and other services to point to your external PostgreSQL instance.
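To make the second step more concrete, here is a minimal Python sketch that renders an env file for GMS. The EBEAN_DATASOURCE_* variable names are an assumption based on the naming used in DataHub's Docker environment files, and the helper names and example host are hypothetical; confirm the exact variables against the compose file referenced above for your DataHub version.

```python
def gms_postgres_env(host, user, password, database="datahub", port=5432):
    """Return datasource settings pointing GMS at an external PostgreSQL.

    Variable names are assumed (EBEAN_DATASOURCE_* convention); verify them
    against the compose and env files shipped with your DataHub release.
    """
    return {
        "EBEAN_DATASOURCE_HOST": f"{host}:{port}",
        "EBEAN_DATASOURCE_URL": f"jdbc:postgresql://{host}:{port}/{database}",
        "EBEAN_DATASOURCE_DRIVER": "org.postgresql.Driver",
        "EBEAN_DATASOURCE_USERNAME": user,
        "EBEAN_DATASOURCE_PASSWORD": password,
    }


def to_env_file(env):
    """Render the mapping as KEY=VALUE lines suitable for a Compose env_file."""
    return "\n".join(f"{key}={value}" for key, value in env.items()) + "\n"


if __name__ == "__main__":
    env = gms_postgres_env("pg.example.internal", "datahub", "change-me")
    print(to_env_file(env), end="")
```

The resulting file can be referenced from the GMS service through Compose's env_file setting, which keeps credentials out of the compose file itself.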
References:
- <https://github.com/datahub-project/datahub/blob/master/docker/postgres/docker-compose.postgres.yml|Docker Compose Postgres Example>
- <Slack Discussion on Docker Compose with External DB>
Would you like a step-by-step example for a specific deployment method (Helm or Docker Compose), or do you need help with a particular cloud provider or environment?
Hint: Mention @runllm in the thread for followups.
Hi <@U08PVDCPJAG>, how did my answer work out for you? Need any more assistance? You can also tag <@U07FV3UHE4D> to let your fellow community members know you’re still looking for guidance.