Difficulty Removing Glossary Terms Assigned from Classification on SQL Sources in DataHub Version 0.13.1

Original Slack Thread

Hi, I am currently using DataHub 0.13.1, and it is not possible to remove glossary terms in the UI that were assigned by classification on SQL sources. Has anyone else experienced this bug?

<@U06TM7M40RK> are you able to assist here?

It seems like you’re experiencing an issue with removing glossary terms that have been assigned from classification on SQL sources in DataHub version 0.13.1. This issue has been reported by other users as well. Specifically, there are known inconsistencies between glossary terms created via the UI and those created via automated classification or API.

According to a GitHub issue in the datahub-project/datahub repository, there are inconsistencies in the URNs of glossary terms created through different methods. Terms created via classification have URNs like urn:li:glossaryTerm:IP%20Address, whereas terms created manually in the UI have URNs like urn:li:glossaryTerm:6782cc0a-b9c3-451a-a523-f21692ef401f. This discrepancy can cause issues in the UI, including difficulties in managing or removing these terms.
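To make the two URN styles concrete, here is a minimal Python sketch that labels a glossary term URN as name-based (as produced by classification) or UUID-based (as produced by the UI). The function name describe_term_urn and the "name"/"uuid" labels are illustrative only and are not part of DataHub.

```python
import uuid
from urllib.parse import unquote

PREFIX = "urn:li:glossaryTerm:"


def describe_term_urn(urn: str) -> dict:
    """Label a glossary term URN as UUID-style or name-style (hypothetical helper)."""
    if not urn.startswith(PREFIX):
        raise ValueError(f"not a glossary term URN: {urn}")
    term_id = urn[len(PREFIX):]
    try:
        # UI-created terms in the thread use a UUID as the identifier.
        uuid.UUID(term_id)
        style = "uuid"
    except ValueError:
        # Classification-created terms use the percent-encoded term name.
        style = "name"
    return {"id": term_id, "style": style, "decoded": unquote(term_id)}


if __name__ == "__main__":
    for example in (
        "urn:li:glossaryTerm:IP%20Address",
        "urn:li:glossaryTerm:6782cc0a-b9c3-451a-a523-f21692ef401f",
    ):
        print(describe_term_urn(example))
```

Running this over the term URNs attached to a dataset would show which ones came from classification, which may help narrow down the terms the UI fails to remove.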

Additionally, another GitHub issue in datahub-project/datahub highlights that tags and glossary terms created via the Python API do not render correctly in the DataHub frontend. This includes issues with assigning and removing these terms from datasets within the UI.

To address this issue, you can try the following steps:

  1. Verify URNs: Ensure that the URNs of the glossary terms are consistent and correctly formatted.
  2. Use the CLI or API: If the UI is not functioning as expected, you might need to use the DataHub CLI or API to remove the glossary terms. For example, you can use the datahub delete command with appropriate filters; note that this deletes the glossary term entity itself, which is broader than detaching the term from a single dataset or column.
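As a rough sketch of the API route in step 2, the Python below only builds a request body for DataHub's GraphQL endpoint; it does not send anything. It assumes the removeTerm mutation and the TermAssociationInput fields (termUrn, resourceUrn, subResourceType, subResource) are available in your DataHub version, so please check them against your instance's GraphQL schema. The helper name, dataset URN, and column name are hypothetical, and the thread does not confirm that this call removes terms written by classification during ingestion.

```python
import json
from typing import Optional

# Assumed mutation shape; verify against the GraphQL schema of your deployment.
REMOVE_TERM_MUTATION = """
mutation removeTerm($input: TermAssociationInput!) {
  removeTerm(input: $input)
}
"""


def build_remove_term_request(
    term_urn: str, dataset_urn: str, field_path: Optional[str] = None
) -> dict:
    """Build (but do not send) a GraphQL request body (hypothetical helper)."""
    term_input = {"termUrn": term_urn, "resourceUrn": dataset_urn}
    if field_path is not None:
        # Classification usually tags columns, so target a schema field.
        term_input["subResourceType"] = "DATASET_FIELD"
        term_input["subResource"] = field_path
    return {"query": REMOVE_TERM_MUTATION, "variables": {"input": term_input}}


if __name__ == "__main__":
    body = build_remove_term_request(
        term_urn="urn:li:glossaryTerm:IP%20Address",
        # Hypothetical dataset URN and column name, for illustration only.
        dataset_urn="urn:li:dataset:(urn:li:dataPlatform:mysql,db.table,PROD)",
        field_path="client_ip",
    )
    print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as a POST to the instance's GraphQL endpoint with an access token. If the term is still present afterwards, it may be stored in an aspect that this mutation does not touch.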

Would you like more detailed instructions on how to use the CLI or API to remove the glossary terms, or do you have any other specific questions?
