Customizing DataHub Search Method for Chinese Language Users

Original Slack Thread

Hi all. Is there a way to customize the DataHub search method?
As far as I know, a DataHub search query can use something like /q *pattern* to run a fuzzy search.
Is there a configuration to make that the default behavior? Thanks.

<@U05CJD391ND> might be able to speak to this!

thanks for your reply.

Hi <@U05AQD305BR> I’m not sure I understand your question. Are you asking what the capabilities of search are? These docs may help, but let me know if you have a specific question about the options: https://datahubproject.io/docs/how/search/

<@U05CJD391ND> thanks for your reply.
I have carefully read the documentation you provided.
My question is not about the capabilities of search. My problem is that I want to change the default behavior of search.
I am a DataHub user whose native language is Chinese. Unlike English, where sentences are made up of words separated by spaces, Chinese is written without spaces between words.
So the default search behavior in DataHub (which I think is prefix matching) is not well suited to searching Chinese text.
I found that if we use /q *word* in the search query, DataHub matches not only by prefix but also when the term appears in the middle of a longer word. I want this kind of search to be the default, so that users don't need to type /q *word* explicitly: they can just type word and DataHub returns the same results as if they had used /q *word*.
Is it possible to configure DataHub search to behave like that?
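To illustrate the problem described above, here is a minimal Python sketch. It is a simplified model, not DataHub's actual analyzer: it assumes text is split on whitespace and that the default search matches token prefixes (which is only my guess at the default behavior). The function names and sample strings are made up for illustration.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Split text on whitespace (assumed, simplified tokenization)."""
    return text.split()


def prefix_match(query: str, text: str) -> bool:
    """True if any whitespace-delimited token starts with the query."""
    return any(tok.startswith(query) for tok in whitespace_tokens(text))


def substring_match(query: str, text: str) -> bool:
    """True if the query appears anywhere in the text, like *query*."""
    return query in text


english = "customer order table"
chinese = "客户订单表"  # "customer order table" written without spaces

# English: "order" is its own token, so prefix matching finds it.
print(prefix_match("order", english))

# Chinese: the whole string is a single token starting with "客户",
# so a prefix match on "订单" ("order") fails ...
print(prefix_match("订单", chinese))

# ... while a wildcard-style substring match succeeds.
print(substring_match("订单", chinese))
```

Under these assumptions, the English prefix match succeeds, the Chinese prefix match fails, and only the substring match finds the Chinese term.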

Many thanks for your help.

Gotcha, so you’re wondering if DataHub can be configured to search by default with a /q operator. This doc may help you out. I haven’t tried it myself, so I can’t guarantee it will let you do the /q *word* override you want, but take a look and see if it helps: https://datahubproject.io/docs/how/search/#customizing-search
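If the search configuration turns out not to cover this case, one possible workaround is to rewrite the query on the client side before it reaches DataHub. The sketch below is not the mechanism described in the linked doc and is untested against a real DataHub instance. The helper name `to_wildcard_query` is hypothetical, and it assumes the /q *word* form from this thread behaves the same when the query is submitted programmatically.

```python
def to_wildcard_query(term: str) -> str:
    """Wrap a plain search term as '/q *term*' (hypothetical helper).

    Queries that are empty or already start with '/q ' are returned
    unchanged (apart from stripping surrounding whitespace).
    """
    term = term.strip()
    if not term or term.startswith("/q "):
        return term
    return f"/q *{term}*"


# A user types a plain Chinese term; the wrapper adds the wildcards.
print(to_wildcard_query("订单"))
```

A wrapper like this would sit in whatever layer forwards user input to search, so users can keep typing plain terms. Terms containing spaces or special characters may need extra handling that this sketch does not attempt.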