Troubleshooting S3 URN Listing Issues in DataHub Version 0.13.0

Original Slack Thread

Hi all! We are having some problems with listing our S3 URNs. Multiple buckets are currently ingested in DataHub (0.13.0). However, when we try to retrieve a single bucket, the call returns 2 buckets, one of which has a completely different bucket name:

    [*urns] = dh_graph_client.get_urns_by_filter(
        platform="s3",
        platform_instance=platform_instance,
        query="bucket_name=joel-test-2",
    )

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Which DataHub version are you using? (e.g. 0.12.0)
  2. Please post any relevant error logs on the thread!

The other bucket name is called dshdev-test-bucket

Rather than using the query parameter, can you try passing a filter instead?

    [*urns] = dh_graph_client.get_urns_by_filter(
        platform="s3",
        extraFilters=[{"field": "customProperties", "values": ["bucket_name=joel-test-2"], "condition": "EQUAL"}],
    )
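
For anyone who wants to try the suggestion end to end, here is a minimal sketch of it as a small helper. It assumes the bucket name is stored as a custom property in the form bucket_name=<name>, as in the filter above. The helper names (bucket_name_filter, list_bucket_urns, connect_and_list) and the server URL are hypothetical and not part of the thread or the DataHub SDK. Whether the filter matches depends on how the S3 source populated customProperties in your deployment, so treat this as a starting point rather than a confirmed fix.

```python
from typing import Any, Dict, List, Optional


def bucket_name_filter(bucket_name: str) -> Dict[str, Any]:
    """Build one filter rule that compares the custom property for equality.

    Assumption: the bucket is recorded as the custom property
    "bucket_name=<name>", as in the suggestion from the thread.
    """
    return {
        "field": "customProperties",
        "values": [f"bucket_name={bucket_name}"],
        "condition": "EQUAL",
    }


def list_bucket_urns(
    graph: Any,
    bucket_name: str,
    platform_instance: Optional[str] = None,
) -> List[str]:
    """Collect the URNs of one S3 bucket via extraFilters (no query string)."""
    urns = graph.get_urns_by_filter(
        platform="s3",
        platform_instance=platform_instance,
        extraFilters=[bucket_name_filter(bucket_name)],
    )
    # get_urns_by_filter yields URNs lazily, so materialize them into a list.
    return list(urns)


def connect_and_list(server_url: str, bucket_name: str) -> List[str]:
    """Hypothetical entry point; server_url is whatever your GMS endpoint is."""
    # Imported lazily so the helpers above work without the SDK installed.
    from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph

    graph = DataHubGraph(DatahubClientConfig(server=server_url))
    return list_bucket_urns(graph, bucket_name)
```

Calling connect_and_list("http://localhost:8080", "joel-test-2") (the URL is an assumed local default) should then return only the URNs whose custom property matches exactly, if the assumption about the property holds.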