Hi Team, when I try to run elasticsearchSetupJob in our k8s environment, I get the following error:
```
going to use default elastic headers
not using any prefix
datahub_analytics_enabled: true
>>> GET _opendistro/_ism/policies/datahub_usage_event_policy response code is 400
>>> failed to GET _opendistro/_ism/policies/datahub_usage_event_policy ! -> exiting
2023/09/14 18:13:15 Command exited with error: exit status 1
```
Our ES installation is on AWS OpenSearch, and here is my config:
```
elasticsearchSetupJob:
  enabled: true
  image:
    repository: image
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 300m
      memory: 256Mi
  extraInitContainers: []
  podSecurityContext:
    fsGroup: 1000
  securityContext:
    runAsUser: 1000
  extraEnvs:
    - name: USE_AWS_ELASTICSEARCH
      value: "true"
    - name: OPENSEARCH_USE_AWS_IAM_AUTH
      value: "true"
```
and the host configs are as follows.
```
elasticsearch:
  host: "hostname"
  port: "443"
  skipcheck: "false"
  insecure: "false"
  useSSL: "true"
  region: "us-west-2"
```
Do I have to add anything else to my configs to bypass this error?
This is most likely a permissions issue: the request is a plain GET, so there isn’t much about it that can be invalid. The elasticsearch-setup job is expected to be able to create the policy for usage events, which power the product’s analytics page. It requires higher permissions than the rest of the containers and doesn’t support IAM authentication, so the OPENSEARCH_USE_AWS_IAM_AUTH flag is not applicable to the setup job.
The most typical scenario here is that users manage their ES instance outside of the setup job, following their own ES/OS management process. They typically disable the elasticsearch-setup job and perform the setup actions manually or via their automation tools. Assuming you’ve already created the elasticsearch user, it’s just a matter of setting up the 3 resources in this section of create-indices.sh: https://github.com/datahub-project/datahub/blob/master/docker/elasticsearch-setup/create-indices.sh#L113. This is only executed once, and then the normal user is typically granted enough permissions to manage the other application indices with the prefix policy.
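If it helps to see why the GET returns 400, here is a minimal Python sketch that repeats the same request and returns the response body instead of only the status code. The policy path is taken from the job log above; the helper names `build_policy_url` and `fetch_policy` are made up for illustration. The sketch sends no credentials, so against a secured AWS domain you would likely need to add whatever authentication your cluster expects.

```python
import urllib.error
import urllib.request
from typing import Tuple

# Path copied from the setup job log; everything else here is illustrative.
POLICY_PATH = "_opendistro/_ism/policies/datahub_usage_event_policy"


def build_policy_url(host: str, port: str, use_ssl: bool) -> str:
    """Build the URL the setup job queries for the usage-event ISM policy."""
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{port}/{POLICY_PATH}"


def fetch_policy(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """GET the URL and return (status_code, body), including for error responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # HTTPError still carries the response body, which often explains the failure.
        return err.code, err.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # "hostname" and "443" are the placeholder values from the config above.
    url = build_policy_url("hostname", "443", use_ssl=True)
    print(url)
    # Uncomment when running from a pod that can reach the cluster:
    # print(fetch_policy(url))
```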
Alternatively, you can disable analytics tracking entirely with DATAHUB_ANALYTICS_ENABLED=false; the analytics metrics on the in-product dashboard will then not be populated.
The prefix is an empty string “” by default. However, if you are managing several DataHub instances on the same OpenSearch cluster, you can separate them by adding a prefix string. For example, with the prefixes instance_a and instance_b, the indices for each DataHub instance would be named along the lines of instance_a_index1 and instance_b_index1.
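To make the naming concrete, here is a small Python sketch of the scheme described above. The function name `prefixed_index` is hypothetical and this is not DataHub's actual code; it only mirrors the instance_a_index1 pattern from the example.

```python
def prefixed_index(prefix: str, index: str) -> str:
    """Return the index name for a given instance prefix.

    An empty prefix (the default) leaves the index name unchanged;
    otherwise the prefix and index are joined with an underscore.
    """
    return f"{prefix}_{index}" if prefix else index


if __name__ == "__main__":
    for prefix in ("", "instance_a", "instance_b"):
        print(repr(prefix), "->", prefixed_index(prefix, "index1"))
```

With an empty prefix the setup job reports "not using any prefix", as in the log at the top of the thread.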
@David Leifker setting DATAHUB_ANALYTICS_ENABLED=false got the elasticsearchSetupJob to run, but the next step, datahub-datahub-system-update-job, failed with the following error in the BuildIndicesPreStep stage:
org.elasticsearch.ElasticsearchStatusException: method [HEAD], host [http://hostname:80], URI [/graph_service_v1?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false], status line [HTTP/1.1 400 Bad Request]
Any pointers?
I invoked the URL manually from our env and see that the query params are not recognized. The response ends with: contains unrecognized parameter: [ignore_unavailable]"},"status":400}
What version of ES is recommended for datahub v0.10.5?
Looking deeper into the logs, we see the following error in the systemUpdate job, prior to the BuildIndicesPreStep stage. Do you think this might be what causes the BuildIndicesPreStep stage to fail?
2023-09-19 13:46:39,377 [kafka-producer-network-thread | producer-2] ERROR o.a.kafka.common.utils.KafkaThread:51 - Uncaught exception in thread 'kafka-producer-network-thread | producer-2':
java.lang.OutOfMemoryError: Java heap space
    at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:61)
    at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:348)
    at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:113)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:447)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:397)
    at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:678)
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:580)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:485)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:550)
    at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:324)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239)
    at java.base/java.lang.Thread.run(Thread.java:829)
> we use AWS open search service 6.5
Does this mean you are running in legacy mode for Elasticsearch 6.5? Or was it a typo, meaning you’re using OpenSearch 2.5? Either way, both are unsupported versions. We support OpenSearch 1.x and Elasticsearch from 7.10 up to, but not including, 8 (i.e. 7.17 or other 7.x releases above 7.10 should also be fine).
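As a quick way to check a cluster version against the ranges stated above, here is a minimal Python sketch. The function name `is_supported` is hypothetical, and the rules encode only what this message says (OpenSearch 1.x, Elasticsearch 7.10 up to but not including 8); they are not an official compatibility matrix.

```python
def is_supported(engine: str, version: str) -> bool:
    """Check an engine/version pair against the ranges quoted in this thread.

    engine:  "opensearch" or "elasticsearch" (case-insensitive)
    version: dotted version string such as "7.17" or "1.3.2"
    """
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0

    name = engine.lower()
    if name == "opensearch":
        # Only the 1.x line is listed as supported.
        return major == 1
    if name == "elasticsearch":
        # 7.10 or later, but below 8.
        return major == 7 and minor >= 10
    return False


if __name__ == "__main__":
    for engine, version in [
        ("elasticsearch", "6.5"),
        ("opensearch", "2.5"),
        ("opensearch", "1.3"),
        ("elasticsearch", "7.17"),
    ]:
        print(engine, version, is_supported(engine, version))
```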