The Status of Stemming and Synonyms Support in DataHub

Original Slack Thread

Hello Everyone!
I was reading about stemming and synonyms support around the beginning of the summer and thought it was a really useful feature. Since then, it has disappeared from the DataHub documentation. I found a link to it in this post, https://datahubspace.slack.com/archives/CV2UVAPPG/p1690483464290849, but the link no longer works. So I am wondering whether this feature has been removed or deprecated. I would like to implement it.
Thank you!

Hi <@U03D58YUFDX>
I believe the feature is still present in DataHub. For some reason, the doc was removed in the 0.10.5 release.

I can confirm that synonyms and stemming are working as expected on the DataHub demo website.

Hi <@U05FBH7FF29>,
Thank you so much for the quick reply!
I haven’t kept any notes about configuring this functionality, as I hoped to do it in the autumn. Do you have any hints on what needs to be changed in values.yaml, or any other pointers on how to enable it? Or maybe a pointer to the git page?
Thank you!

As I mentioned in the thread linked above, we have to add the synonyms to the resources of the metadata-io module, then build and redeploy the services. This is not very flexible, so I am planning to improve it and raise a PR for it.
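
For context, here is a rough sketch of how synonyms and stemming are commonly wired into Elasticsearch index settings, which is the kind of configuration a bundled synonym list typically feeds. This is not DataHub's actual code: the function name, the filter and analyzer names, and the sample synonym rules are all hypothetical. Only the `synonym_graph` and `stemmer` filter types and the `standard` tokenizer are standard Elasticsearch features.

```python
import json


def build_analysis_settings(synonym_lines):
    """Build an Elasticsearch 'analysis' settings dict (illustrative only).

    All filter and analyzer names below are hypothetical, not DataHub's.
    """
    # Keep only real rules: skip blank lines and '#' comment lines.
    rules = [
        line.strip()
        for line in synonym_lines
        if line.strip() and not line.strip().startswith("#")
    ]
    return {
        "analysis": {
            "filter": {
                # Expands equivalent terms at search time.
                "example_synonyms": {"type": "synonym_graph", "synonyms": rules},
                # Reduces words to their English stem.
                "example_stemmer": {"type": "stemmer", "language": "english"},
            },
            "analyzer": {
                "example_search_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "example_synonyms", "example_stemmer"],
                }
            },
        }
    }


# Hypothetical synonym rules, one comma-separated group per line.
settings = build_analysis_settings(
    ["# comment", "pii, personal data", "", "dwh, data warehouse"]
)
print(json.dumps(settings, indent=2))
```

In a setup like this, changing the synonym list means changing the index settings, which is consistent with the rebuild-and-redeploy step described above when the list ships inside a module's resources.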

https://github.com/datahub-project/datahub/blob/releases/v0.10.4.1/docs/architecture/stemming_and_synonyms.md

Thank you so much <@U05FBH7FF29> for the clarification and the repo link!