Using DataHub Python SDK to Perform GraphQL Search Operations

To run the same kind of searches from the DataHub Python SDK that you can run in the DataHub UI, use the DataHubGraph client to call the GraphQL API. The client also offers helpers such as get_urns_by_filter that return matching URNs, but executing a GraphQL search query yourself lets you choose exactly which fields come back.

The steps for performing a search by executing a GraphQL query through the Python SDK are:

  1. Install the DataHub Python SDK:
    Ensure you have the DataHub Python SDK installed. You can install it using pip if you haven’t already:

    pip install acryl-datahub
    
  2. Set Up the DataHubGraph Client:
    Initialize the DataHubGraph client with a DatahubClientConfig that holds your DataHub server URL and access token.

  3. Define and Execute the GraphQL Query:
    Use the DataHubGraph client to execute a GraphQL query for searching entities.
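
For step 2, it is often convenient to keep the server URL and token out of the source file. The following is a minimal sketch, not the SDK's own mechanism: the helper names load_settings and make_client are hypothetical, and the environment variable names DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN are an assumption modeled on the names the DataHub CLI uses, so adjust them to your setup.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read connection settings from environment-style variables.

    The variable names are an assumption for this sketch; rename as needed.
    """
    env = os.environ if env is None else env
    settings = {"server": env.get("DATAHUB_GMS_URL", "http://localhost:8080")}
    token = env.get("DATAHUB_GMS_TOKEN")
    if token:
        # Only include the token when one is configured.
        settings["token"] = token
    return settings


def make_client(settings: dict):
    """Build a DataHubGraph client from the settings dict (hypothetical helper)."""
    # Imported lazily so load_settings works even without the SDK installed.
    from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig

    return DataHubGraph(DatahubClientConfig(**settings))
```

With this in place, `client = make_client(load_settings())` would stand in for the hard-coded configuration.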

Here is a sample code snippet to perform a search:

from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig

# Initialize the DataHubGraph client.
# DataHubGraph expects a DatahubClientConfig object rather than a plain dict.
config = DatahubClientConfig(
    server="http://localhost:8080",  # Replace with your DataHub server URL
    token="YOUR_ACCESS_TOKEN",  # Replace with your access token
)
client = DataHubGraph(config)

# Define the GraphQL query
query = """
query search($input: SearchInput!) {
  search(input: $input) {
    start
    count
    total
    searchResults {
      entity {
        urn
        type
        ... on Dataset {
          name
          lastIngested
        }
      }
    }
  }
}
"""

# Define the input variables
variables = {
  "input": {
    "type": "DATASET",
    "query": "*",
    "start": 0,
    "count": 10,
    "orFilters": [
      {
        "and": [
          {
            "field": "urn",
            "values": ["urn:li:dataset:(urn:li:dataPlatform:mysql,datahub.metadata_aspect_v2,PROD)"],
            "condition": "EQUALS"
          }
        ]
      }
    ]
  }
}

# Execute the GraphQL query
response = client.execute_graphql(query, variables)
print(response)

This example searches for datasets with a specific URN using the GraphQL API through the Python SDK. To search more broadly, change the "query" text, the entity "type", or the "orFilters" list in the variables, and add fields to the GraphQL query to return more metadata.
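
Because the search input takes "start" and "count", larger result sets have to be fetched page by page. Below is a rough sketch of one way to do that. The search_urns helper is hypothetical, and it assumes the execute callable (for example client.execute_graphql) returns the GraphQL data payload as a dict with a top-level "search" key, so check this against what your SDK version returns.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Trimmed version of the search query; only the fields needed for paging.
SEARCH_QUERY = """
query search($input: SearchInput!) {
  search(input: $input) {
    start
    count
    total
    searchResults { entity { urn type } }
  }
}
"""

# Assumed shape of the callable: (query, variables) -> data payload dict.
Execute = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def search_urns(
    execute: Execute,
    entity_type: str = "DATASET",
    text: str = "*",
    or_filters: Optional[List[Dict[str, Any]]] = None,
    page_size: int = 10,
) -> Iterator[str]:
    """Yield the URN of every search hit, requesting one page at a time."""
    start = 0
    while True:
        search_input: Dict[str, Any] = {
            "type": entity_type,
            "query": text,
            "start": start,
            "count": page_size,
        }
        if or_filters:
            search_input["orFilters"] = or_filters
        result = execute(SEARCH_QUERY, {"input": search_input})["search"]
        hits = result.get("searchResults") or []
        for hit in hits:
            yield hit["entity"]["urn"]
        start += len(hits)
        # Stop on an empty page or once every reported hit has been seen.
        if not hits or start >= result.get("total", 0):
            break
```

A call such as `for urn in search_urns(client.execute_graphql, text="metadata"): print(urn)` would then walk through all matching datasets.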
