Troubleshooting ELASTICSEARCH_MAIN_TOKENIZER Setting for Korean Searching

Original Slack Thread

Hi, where can I find the datahub-gms setting for ELASTICSEARCH_MAIN_TOKENIZER?
I think it is not working properly in my setup.

Purpose
• Trying to change ELASTICSEARCH_MAIN_TOKENIZER to nori_tokenizer.
Target Environment Setting
• Run the quickstart with the following command:
ELASTICSEARCH_MAIN_TOKENIZER=nori_tokenizer datahub docker quickstart
• Once the Elasticsearch container reports healthy, install the analysis-nori plugin inside it with elasticsearch-plugin:

[root@elasticsearch elasticsearch]# elasticsearch-plugin install analysis-nori
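
For reference, one quick way to confirm the plugin actually loaded is to list installed plugins and run the tokenizer directly through the _analyze API. This is a minimal sketch; it assumes Elasticsearch is reachable on localhost:9200, that the container was restarted after the install, and the sample text is arbitrary.

```
# List installed plugins; "analysis-nori" should appear once the node has restarted.
curl -s "http://localhost:9200/_cat/plugins?v"

# Tokenize a short Korean phrase with nori_tokenizer to confirm it is usable.
curl -s -X POST "http://localhost:9200/_analyze" \
  -H 'Content-Type: application/json' \
  -d '{"tokenizer": "nori_tokenizer", "text": "이름과 주소"}'
```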

<@UV5UEC3LN> Would love your help here!

That’s the right environment variable and it should work. What behavior do you see that indicates it’s not working?
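
For reference, one quick sanity check is whether the variable actually reached the GMS container. A minimal sketch, assuming the default quickstart container name datahub-gms:

```
# Confirm the variable is set inside the GMS container
# ("datahub-gms" assumes the default quickstart compose naming).
docker exec datahub-gms env | grep ELASTICSEARCH_MAIN_TOKENIZER
```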

I expected Korean search to work with 2-character words like ‘이름’ (name) and ‘주소’ (address), but it didn’t.
Thank you for checking.

Anyway, I found a solution by changing the index setting’s minimum token length from 3 to 2 in Elasticsearch.
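
For reference, that kind of change goes through the index settings API. The sketch below is illustrative only: the index name and the filter that carries the min_gram value are placeholders and vary by DataHub version, and analysis settings can only be updated while the index is closed.

```
INDEX="datasetindex_v2"   # placeholder: use the index you are actually searching

# Analysis settings can only be changed on a closed index.
curl -s -X POST "http://localhost:9200/${INDEX}/_close"

# Lower the minimum gram length of the partial-match filter from 3 to 2.
# "partial_filter" is a placeholder; use the filter name your index actually defines.
curl -s -X PUT "http://localhost:9200/${INDEX}/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index": {"analysis": {"filter": {"partial_filter": {"type": "edge_ngram", "min_gram": 2, "max_gram": 20}}}}}'

curl -s -X POST "http://localhost:9200/${INDEX}/_open"
```

Note that documents indexed before the change keep their old tokens, so existing data may still need to be reindexed before 2-character matches appear.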

Ah yes, there is a 3-character minimum by default on partial matches, for performance reasons. Glad you were able to resolve it!
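
If you want to see where that limit lives, the analysis section of the index settings shows the configured min_gram values. A minimal sketch, again with a placeholder index name:

```
# Dump the analysis settings and look for the min_gram values on the
# ngram/edge_ngram filters used for partial matching; the index name is a placeholder.
curl -s "http://localhost:9200/datasetindex_v2/_settings?pretty" | grep -B1 -A1 "min_gram"
```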