Troubleshooting ElasticsearchSetupJob on AWS OpenSearch and Terraforming Policy Creation

user-1 · March 4, 2024, 5:37pm

Hi Team - when I try to run elasticsearchSetupJob in our k8s environment, I get the following error:

```
going to use default elastic headers
not using any prefix

 datahub_analytics_enabled: true

>>> GET _opendistro/_ism/policies/datahub_usage_event_policy response code is 400
>>> failed to GET _opendistro/_ism/policies/datahub_usage_event_policy ! -> exiting
2023/09/14 18:13:15 Command exited with error: exit status 1
```
Our ES installation is on AWS OpenSearch, and here is my config:
```
  elasticsearchSetupJob:
    enabled: true
    image:
      repository: image
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 300m
        memory: 256Mi
    extraInitContainers: []
    podSecurityContext:
      fsGroup: 1000
    securityContext:
      runAsUser: 1000
    extraEnvs:
      - name: USE_AWS_ELASTICSEARCH
        value: "true"
      - name: OPENSEARCH_USE_AWS_IAM_AUTH
        value: "true"
```
The host configs are as follows:
```
        elasticsearch:
          host: "hostname"
          port: "443"
          skipcheck: "false"
          insecure: "false"
          useSSL: "true"
          region: "us-west-2"
```
 Do I have to add anything else to my configs to bypass this error?

user-1 · March 4, 2024, 5:37pm

<@U02TYQ4SPPD> can you help with this?

user-4 · March 4, 2024, 5:38pm

This is most likely a permissions issue: the request is a plain GET, so there isn't much about the request itself that can be invalid. The elasticsearch-setup job is expected to be able to create the policy for usage events, which power the product's analytics page. It requires higher permissions than the rest of the containers and doesn't support IAM authentication, so the OPENSEARCH_USE_AWS_IAM_AUTH flag is not applicable to the setup job.

The most typical scenario here is that users control their ES instance outside of the setup job, based on their own ES/OS management process. They typically disable the elasticsearch-setup job and perform the setup actions manually or via their automation tools. Assuming you've already created the elasticsearch user, it's just a matter of setting up the 3 resources in this section of create-indices.sh (https://github.com/datahub-project/datahub/blob/master/docker/elasticsearch-setup/create-indices.sh#L113). This is only executed once, and the normal user is then typically granted enough permissions to manage the other application indices with the prefix policy.

user-4 · March 4, 2024, 5:38pm

Links to the files used by the script: the ISM policy (https://github.com/datahub-project/datahub/blob/99d7eb756c09a3313a4c1bda6f96a0953004b58c/metadata-service/restli-servlet-impl/src/main/resources/index/usage-event/aws_es_ism_policy.json#L4) and the index template (https://github.com/datahub-project/datahub/blob/99d7eb756c09a3313a4c1bda6f96a0953004b58c/metadata-service/restli-servlet-impl/src/main/resources/index/usage-event/aws_es_index_template.json#L4).

user-4 · March 4, 2024, 5:38pm

Those are templates so be sure to replace PREFIX
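
For anyone applying the templates by hand, here is a minimal sketch of the PREFIX substitution in Python. The helper name `render_template` and the shortened sample JSON are hypothetical and only stand in for the real files linked above. What exactly PREFIX should expand to (for example, whether it carries a trailing underscore) is an assumption to check against create-indices.sh.

```python
import json


def render_template(template_text: str, prefix: str) -> dict:
    """Replace the literal PREFIX placeholder, then parse the result as JSON."""
    rendered = template_text.replace("PREFIX", prefix)
    # Parsing here surfaces a substitution that left the JSON invalid.
    return json.loads(rendered)


# Hypothetical, heavily shortened stand-in for one of the template files.
sample = '{"index_patterns": ["PREFIXdatahub_usage_event-*"]}'

print(render_template(sample, ""))             # no prefix configured
print(render_template(sample, "instance_a_"))  # assumed prefix form
```

The parsed dict can then be sent to the cluster with whatever tooling you already use for managing OpenSearch.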

user-4 · March 4, 2024, 5:38pm

Alternatively, you can disable analytics tracking entirely by setting DATAHUB_ANALYTICS_ENABLED=false. The analytics metrics on the in-product dashboard will then not be populated.

user-1 · March 4, 2024, 5:38pm

Got it. Thanks <@U02TYQ4SPPD>. This is super helpful

user-1 · March 4, 2024, 5:38pm

I was referring to this doc (https://datahubproject.io/docs/deploy/aws/#elasticsearch-service) and added the OPENSEARCH_USE_AWS_IAM_AUTH property to my yaml.

user-1 · March 4, 2024, 5:38pm

Let me try setting up the ISM policy and datahub_usage_event_index_template and see if it works.

user-1 · March 4, 2024, 5:38pm

May I know what this index template and ISM policy are used for?

user-1 · March 4, 2024, 5:38pm

Also, what is PREFIX used for?

user-4 · March 4, 2024, 5:38pm

The prefix is an empty string ("") by default. However, if you are managing several DataHub instances on the same OpenSearch cluster, you can separate them by adding a prefix string. For example, with the prefixes instance_a and instance_b, the indices are kept separate for each DataHub instance, e.g. instance_a_index1 and instance_b_index1.
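
The naming scheme described above can be sketched as follows. The function `index_name` is hypothetical, and the underscore separator is inferred from the instance_a_index1 example rather than taken from DataHub's code.

```python
def index_name(prefix: str, base: str) -> str:
    """Build an index name; an empty prefix leaves the base name unchanged."""
    # Assumption: prefix and base are joined with an underscore,
    # as in the instance_a_index1 example.
    return f"{prefix}_{base}" if prefix else base


for prefix in ("", "instance_a", "instance_b"):
    print(repr(prefix), "->", index_name(prefix, "index1"))
```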

user-1 · March 4, 2024, 5:38pm

got it.

user-1 · March 4, 2024, 5:38pm

@David Leifker setting DATAHUB_ANALYTICS_ENABLED=false got the elasticSearchSetupJob running, but the next step, datahub-datahub-system-update-job, failed with the following error in the BuildIndicesPreStep stage:

org.elasticsearch.ElasticsearchStatusException: method [HEAD], host [http://hostname:80], URI [/graph_service_v1?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false], status line [HTTP/1.1 400 Bad Request]
Any pointers?

user-1 · March 4, 2024, 5:38pm

I invoked the URL manually from our env and see that the query params are not recognized:
contains unrecognized parameter: [ignore_unavailable]"},"status":400}
What version of ES is recommended for datahub v0.10.5?
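
To check which query parameters the client sends, the URI from the exception above can be split into its path and parameters. This is only a sketch using Python's standard library; the helper name `query_params` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# URI copied from the ElasticsearchStatusException message.
uri = (
    "/graph_service_v1?ignore_throttled=false&ignore_unavailable=false"
    "&expand_wildcards=open%2Cclosed&allow_no_indices=false"
)


def query_params(uri: str) -> tuple[str, dict[str, str]]:
    """Return the path and a flat dict of the (decoded) query parameters."""
    parts = urlsplit(uri)
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.path, params


path, params = query_params(uri)
print(path)
for key, value in params.items():
    print(f"  {key} = {value}")
```

Each listed parameter can then be tested individually against the cluster to see which ones it rejects.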

user-1 · March 4, 2024, 5:38pm

we use AWS open search service 6.5

user-1 · March 4, 2024, 5:38pm

Upon a deeper look into the logs, we see the following error in the systemUpdate job, prior to the BuildIndicesPreStep stage. Do you think this might be what causes the BuildIndicesPreStep stage to fail?
2023-09-19 13:46:39,377 [kafka-producer-network-thread | producer-2] ERROR o.a.kafka.common.utils.KafkaThread:51 - Uncaught exception in thread 'kafka-producer-network-thread | producer-2':
java.lang.OutOfMemoryError: Java heap space
    at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:61)
    at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:348)
    at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:113)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:447)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:397)
    at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:678)
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:580)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:485)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:550)
    at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:324)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239)
    at java.base/java.lang.Thread.run(Thread.java:829)

user-1 · March 4, 2024, 5:38pm

<@UV5UEC3LN> This is what I was talking about in our office hours.

datahub_team · March 4, 2024, 5:38pm

> we use AWS open search service 6.5
Does this mean you are running in legacy mode for Elasticsearch 6.5? Or was it a typo, meaning you're using OpenSearch 2.5? Both are unsupported versions, though. We support OpenSearch 1.x and Elasticsearch from 7.10 up to, but not including, 8 (e.g. 7.17 should also be fine).
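
As a rough sketch, the support range stated above can be written as a small check. The function `is_supported` is hypothetical and encodes only what is said in this thread, not an official compatibility matrix.

```python
def is_supported(engine: str, version: str) -> bool:
    """Check an engine/version pair against the range given in this thread."""
    # Only major and minor matter here; a missing minor counts as 0.
    parts = [int(p) for p in version.split(".")[:2]]
    major, minor = (parts + [0])[:2]

    engine = engine.lower()
    if engine == "opensearch":
        return major == 1  # OpenSearch 1.x
    if engine == "elasticsearch":
        return (7, 10) <= (major, minor) < (8, 0)  # 7.10 up to, not including, 8
    return False


for engine, version in [("elasticsearch", "6.5"), ("opensearch", "2.5"),
                        ("elasticsearch", "7.17"), ("opensearch", "1.3")]:
    print(engine, version, is_supported(engine, version))
```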

user-1 · March 4, 2024, 5:38pm

This is what we see in our AWS OpenSearch UI: [attachment]
