Deploying DataHub into Amazon EKS on Fargate: Challenges and Solutions

Original Slack Thread

Hi all, I'm deploying DataHub into production on Amazon EKS with Fargate. Can you please let me know whether the Helm release for the prerequisites can be deployed to Fargate? The pods are not coming up and stay in Pending with the error below:

Pod not supported on Fargate: volumes not supported: elasticsearch-master not supported because: PVC elasticsearch-master-elasticsearch-master-0 not bound, invalid SecurityContext fields: Privileged

Fargate is not designed for using persistent volumes. Data stores like Kafka, MySQL, and Elasticsearch all need to persist data and cannot be run on Fargate. For those services, either use the managed solutions that AWS offers or run them on a managed node group, which can use EBS as a persistent storage option.
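
The two reasons named in the scheduler message above (a PVC-backed volume and a privileged security context) can be checked against a pod spec before deploying. What follows is a minimal Python sketch, not an AWS or Kubernetes API: the function name `fargate_blockers` and the sample spec (including the init container name) are hypothetical, and it only looks for the two conditions from that error rather than every Fargate restriction.

```python
def fargate_blockers(pod_spec: dict) -> list[str]:
    """Return the reasons, as seen in the error above, a pod spec may be rejected."""
    reasons = []
    # Volumes backed by a PersistentVolumeClaim, as in the elasticsearch-master error.
    for vol in pod_spec.get("volumes", []):
        if "persistentVolumeClaim" in vol:
            reasons.append(f"volume {vol.get('name')} uses a PVC")
    # Privileged containers, including init containers.
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    for c in containers:
        if (c.get("securityContext") or {}).get("privileged"):
            reasons.append(f"container {c.get('name')} is privileged")
    return reasons


# Hypothetical spec shaped like the Elasticsearch pod from the error message.
example_spec = {
    "volumes": [
        {
            "name": "elasticsearch-master",
            "persistentVolumeClaim": {
                "claimName": "elasticsearch-master-elasticsearch-master-0"
            },
        },
    ],
    "initContainers": [
        {"name": "configure-sysctl", "securityContext": {"privileged": True}},
    ],
    "containers": [{"name": "elasticsearch"}],
}

for reason in fargate_blockers(example_spec):
    print(reason)
```

An empty result only means neither of these two conditions was found; it does not guarantee the pod will schedule on Fargate.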

Thanks David, I deployed on an EKS managed node group and can see the pods working with EBS volumes.

I now see the pods failing with readiness/liveness probe errors. Could you advise what I am doing wrong here?

elasticsearch-master-0 - Readiness probe failed: Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=yellow&timeout=1s" ) Cluster is not yet ready (request params: "wait_for_status=yellow&timeout=1s" )

prerequisites-mysql-0 - Startup probe failed: mysqladmin: [Warning] Using a password on the command line interface can be insecure. mysqladmin: connect to server at 'localhost' failed error: 'Can't connect to local MySQL server through socket '/opt/bitnami/mysql/tmp/mysql.sock' (2)' Check that mysqld is running and that the socket: '/opt/bitnami/mysql/tmp/mysql.sock' exists!

prerequisites-kafka-0 - Readiness probe failed: dial tcp 10.1.44.183:9092: connect: connection refused

prerequisites-zookeeper-0 - Readiness probe failed:
Liveness probe failed:

All 4 prerequisites are failing with the errors above.

You'll have to provide the container logs for the prerequisite pods; they will contain more information about why the services are failing to start.
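
One way to collect those logs is `kubectl logs`, once for the current container and once with `--previous` for the last crashed instance. The sketch below only builds and prints the commands. The pod names come from this thread, while the `default` namespace, the `--tail` value, and the helper name `log_commands` are assumptions to adjust for the actual release.

```python
import shlex

# Pod names as reported in the thread.
PODS = [
    "elasticsearch-master-0",
    "prerequisites-mysql-0",
    "prerequisites-kafka-0",
    "prerequisites-zookeeper-0",
]


def log_commands(pods, namespace="default", tail=200):
    """Build kubectl argv lists for the current and previous container logs."""
    commands = []
    for pod in pods:
        base = ["kubectl", "logs", pod, "-n", namespace, f"--tail={tail}"]
        commands.append(base)
        # --previous shows logs from the prior container instance, if one exists.
        commands.append(base + ["--previous"])
    return commands


for cmd in log_commands(PODS):
    print(shlex.join(cmd))
```

The `--previous` variant returns an error when a container has not restarted yet, which is expected for pods that never got past their first start.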

Thanks again for the response, David. I fixed those errors by changing the readiness probe in the Elasticsearch StatefulSet config (initialDelaySeconds: 10 -> 30), and the pods are now up and healthy.
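
The change described above can be sketched as a small transformation on a StatefulSet manifest loaded as a Python dict. The manifest here is a trimmed, hypothetical stand-in rather than the real chart output, and `set_readiness_delay` is an illustrative helper, not part of any Kubernetes client.

```python
import copy


def set_readiness_delay(statefulset: dict, seconds: int) -> dict:
    """Return a copy of the manifest with readinessProbe.initialDelaySeconds set."""
    patched = copy.deepcopy(statefulset)
    for container in patched["spec"]["template"]["spec"]["containers"]:
        # Only touch containers that already define a readiness probe.
        if "readinessProbe" in container:
            container["readinessProbe"]["initialDelaySeconds"] = seconds
    return patched


# Hypothetical, trimmed manifest with the original 10 second delay.
manifest = {
    "kind": "StatefulSet",
    "metadata": {"name": "elasticsearch-master"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "elasticsearch",
                        "readinessProbe": {"initialDelaySeconds": 10},
                    }
                ]
            }
        }
    },
}

patched = set_readiness_delay(manifest, 30)
probe = patched["spec"]["template"]["spec"]["containers"][0]["readinessProbe"]
print(probe["initialDelaySeconds"])
```

A direct edit to a StatefulSet that Helm manages is typically overwritten on the next upgrade of the release, so applying the same setting through the chart's values is likely the more durable route. The exact values key depends on the chart version and is not confirmed in this thread.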