Configuring DataHub Search Queries with Multiple Tags and AND Operator

Original Slack Thread

<@U06TM7M40RK> When filtering on multiple tags in search, are the results combined with the AND operator or the OR operator?

By default, DataHub combines multiple tag filters with AND: every specified tag must match for a result to appear. AND is also the default boolean logic used to interpret text in a query string.

According to the <https://datahubproject.io/docs/0.14.0/how/search|DataHub search documentation>, the default behavior is to match all filters (AND logic). However, you can change this behavior to match any filter (OR logic) by adjusting the filter settings.

For example, in the Advanced Filter view, you can set the results to match any filter instead of all filters by selecting the “any filter” option from the drop-down menu.
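
The difference between the two options can be modeled with plain set logic. The sketch below only illustrates the semantics and is not DataHub's implementation; the function names and the sample tag URNs are hypothetical.

```python
def matches_all_filters(entity_tags, selected_tags):
    """AND semantics: the entity must carry every selected tag."""
    return set(selected_tags).issubset(entity_tags)


def matches_any_filter(entity_tags, selected_tags):
    """OR semantics: the entity must carry at least one selected tag."""
    return not set(selected_tags).isdisjoint(entity_tags)


# Hypothetical dataset that carries tagA but not tagB.
dataset_tags = {"urn:li:tag:tagA"}
selected = ["urn:li:tag:tagA", "urn:li:tag:tagB"]

print(matches_all_filters(dataset_tags, selected))  # False
print(matches_any_filter(dataset_tags, selected))   # True
```

With "all filters" the dataset above is excluded because it lacks tagB; with "any filter" it is included because tagA alone is enough.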

Would you like more details on how to configure this in your search queries?

<@U06TM7M40RK> How can I configure a filter on multiple tags with the AND operator?

To filter on multiple tags with AND logic, use the orFilters field of the GraphQL API. orFilters takes a list of filter groups that are combined with OR; each group has an and list whose criteria must all be satisfied. Putting every tag criterion inside a single and group therefore returns only results that match all of them.

Here is an example of how you can achieve this using the SearchInput object in a GraphQL query:

query searchDataset {
  search(input: {
    type: DATASET, 
    query: "", 
    start: 0, 
    count: 1000,
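    # orFilters is a list of AND groups; a result may match any one group.
    orFilters: [
      {
        # Every criterion inside "and" must match.
        and: [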
          {
            field: "tags",
            values: ["urn:li:tag:tagA"]
          },
          {
            field: "tags",
            values: ["urn:li:tag:tagB"]
          }
        ]
      }
    ]
  }) {
    start
    count
    total
    searchResults {
      entity {
        urn
        type
      }
    }
  }
}

In this example:

  • The orFilters field contains a list with a single and filter.
  • The and filter specifies that both urn:li:tag:tagA and urn:li:tag:tagB must be present for a dataset to be included in the search results.

This configuration ensures that only datasets with both tags will be returned.
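
If the tag list is assembled programmatically, the same orFilters structure can be generated instead of written by hand. The following is a minimal sketch under stated assumptions: the helper names build_tag_or_filters and build_search_payload are hypothetical, and the request body shape (a JSON object with query and variables) is the common GraphQL-over-HTTP convention rather than something confirmed in this thread. It only builds the payload; sending it to your DataHub GraphQL endpoint with the appropriate authentication is left out.

```python
import json

# Same search as the example above, with the input passed as a GraphQL variable.
SEARCH_QUERY = """
query searchDataset($input: SearchInput!) {
  search(input: $input) {
    start
    count
    total
    searchResults {
      entity {
        urn
        type
      }
    }
  }
}
"""


def build_tag_or_filters(tag_urns, match_all=True):
    """Build the orFilters list for the given tag URNs (hypothetical helper)."""
    criteria = [{"field": "tags", "values": [urn]} for urn in tag_urns]
    if match_all:
        # One AND group holding every criterion: all tags must be present.
        return [{"and": criteria}]
    # One AND group per criterion: groups are ORed, so any tag is enough.
    return [{"and": [criterion]} for criterion in criteria]


def build_search_payload(tag_urns, match_all=True, count=1000):
    """Build the JSON body for a GraphQL search request (hypothetical helper)."""
    return {
        "query": SEARCH_QUERY,
        "variables": {
            "input": {
                "type": "DATASET",
                "query": "",
                "start": 0,
                "count": count,
                "orFilters": build_tag_or_filters(tag_urns, match_all),
            }
        },
    }


payload = build_search_payload(["urn:li:tag:tagA", "urn:li:tag:tagB"])
print(json.dumps(payload["variables"], indent=2))
```

Passing match_all=False places each tag in its own and group, which corresponds to OR logic across the tags.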

Would you like more details on this configuration or any other aspect of filtering in DataHub?
