Troubleshooting DataHub entity deletion methods

Original Slack Thread

Hello. Sorry if this is the wrong chat, but has anyone figured out a solution?

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Which DataHub version are you using? (e.g. 0.12.0)
  2. Please post any relevant error logs on the thread!

Could you please try specifying `--platform`?
datahub delete --platform airflow --entity-type dataProcessInstance

<@U0445MUD81W> thanks for your advice, but it also didn’t work.
We ingest entities (platforms, dataFlow/dataJob objects, etc.) using the Python SDK from scripts, so it’s not Airflow.
However, I tried this command (‘geo’ is our custom platform, which we ingested earlier):


```
[2024-05-09 17:06:24,120] INFO     {datahub.cli.delete_cli:341} - Using DataHubGraph: configured to talk to <http://localhost:8080>
Found no urns to delete. Maybe you want to change your filters to be something different?
```
However, that platform exists, and dataProcessInstance objects have been created for the tasks in it:
```
datahub delete by-filter --dry-run --platform geo

[...]
[Dry-run] Would delete 2120 entities (impacts 2120 versioned rows and 0 timeseries aspect rows).
```
The only way I have found that works is deleting each dataProcessInstance entity by its full `--urn`, which can be copied from the browser. This deletes the run from the run history. Obviously, that’s not really practical :disappointed:. [attachment: Screenshot 2024-05-09 at 17.08.48.png]
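
Since deleting by full `--urn` is the one method reported to work here, the copy-and-paste step could at least be scripted. A minimal sketch, assuming the URNs have already been collected into a list. The helper names `build_delete_commands` and `run_commands` are hypothetical, the example URN is a placeholder, and only the `--urn` and `--dry-run` flags that appear in this thread are used:

```python
import shlex
import subprocess


def build_delete_commands(urns, dry_run=True):
    """Return one `datahub delete --urn ...` argv list per non-empty URN."""
    commands = []
    for urn in urns:
        urn = urn.strip()
        if not urn:
            continue  # skip blank lines from a pasted list
        argv = ["datahub", "delete", "--urn", urn]
        if dry_run:
            argv.append("--dry-run")  # preview only, nothing is deleted
        commands.append(argv)
    return commands


def run_commands(commands):
    """Run each command in turn and stop at the first failure."""
    for argv in commands:
        print("running:", shlex.join(argv))
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder URN for illustration only; paste real ones from the UI.
    example_urns = ["urn:li:dataProcessInstance:<id>"]
    for argv in build_delete_commands(example_urns):
        print(shlex.join(argv))
```

Keeping `dry_run=True` as the default means nothing is removed until the generated commands have been reviewed. Depending on the CLI version, each real deletion may still ask for confirmation.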

Another solution I can suggest is GraphQL:

  1. Get all URNs by platform and entity type.
  2. Delete using the URN.

Here are the queries:
```
{
  search(
    input: {
      type: DATA_PROCESS_INSTANCE
      query: "*"
      start: 0
      count: 1000
      orFilters: [
        {
          and: [
            { field: "platform", values: ["geo"], condition: CONTAIN }
          ]
        }
      ]
    }
  ) {
    start
    count
    total
    searchResults {
      entity {
        urn
        ... on DataProcessInstance {
          urn
          name
        }
      }
    }
  }
}
```
```
mutation {
  deleteQuery(urn: "<urn>")
}
```
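
The two steps above could be driven from Python using only the standard library. This is a sketch under assumptions, not a confirmed recipe: the endpoint path `/api/graphql`, the bearer-token header, the `SearchInput` variable type, and all helper names are not from the thread. Note that in this thread the search itself returned no results for the poster.

```python
import json
import urllib.request

# Same search as above, rewritten to take its input as a GraphQL variable.
# The type name SearchInput is an assumption about the schema.
SEARCH_QUERY = """
query search($input: SearchInput!) {
  search(input: $input) {
    start
    count
    total
    searchResults { entity { urn } }
  }
}
"""


def build_search_payload(platform, start=0, count=1000):
    """Build the JSON body for step 1: find URNs by platform and entity type."""
    search_input = {
        "type": "DATA_PROCESS_INSTANCE",
        "query": "*",
        "start": start,
        "count": count,
        "orFilters": [
            {"and": [{"field": "platform", "values": [platform], "condition": "CONTAIN"}]}
        ],
    }
    return {"query": SEARCH_QUERY, "variables": {"input": search_input}}


def extract_urns(response):
    """Pull the URNs out of a decoded search response; empty list if none."""
    data = response.get("data") or {}
    search = data.get("search") or {}
    return [item["entity"]["urn"] for item in search.get("searchResults") or []]


def post_graphql(payload, url="http://localhost:8080/api/graphql", token=None):
    """POST a payload to the (assumed) GraphQL endpoint and decode the reply."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), headers=headers
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Against a running instance, `extract_urns(post_graphql(build_search_payload("geo")))` would give the list of URNs to feed into step 2.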

<@U0445MUD81W> same result as from the CLI: it doesn’t find any dataProcessInstance there… :sweat_smile: But thanks for the idea. [attachment: Screenshot 2024-05-09 at 17.38.11.png]

[attachment: Screenshot 2024-05-09 at 17.40.54.png]

Just this query:

```
{
  search(
    input: {
      type: DATA_PROCESS_INSTANCE
      query: "*"
      start: 0
      count: 10
    }
  ) {
    start
    count
    total
    searchResults {
      entity {
        urn
        ... on DataProcessInstance {
          urn
          name
        }
      }
    }
  }
}
```
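
The `start`, `count`, and `total` fields requested by this query are what a client would use to page through results. A small sketch of that arithmetic follows. `fetch_page` is a hypothetical callable standing in for whatever actually sends the query, and it is assumed to return the `search` object of the response as a plain dict.

```python
def page_offsets(total, page_size):
    """Return the `start` values needed to cover `total` results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return list(range(0, total, page_size))


def collect_urns(fetch_page, page_size=10):
    """Call fetch_page(start, count) until every result has been seen.

    fetch_page is assumed to return a dict with `total` and
    `searchResults` keys, mirroring the query's selection set.
    """
    first = fetch_page(0, page_size)
    urns = [r["entity"]["urn"] for r in first["searchResults"]]
    # The first page is already fetched, so skip offset 0.
    for start in page_offsets(first["total"], page_size)[1:]:
        page = fetch_page(start, page_size)
        urns.extend(r["entity"]["urn"] for r in page["searchResults"])
    return urns
```

If `total` comes back as 0 for this unfiltered query, the search returned no dataProcessInstance results at all, which would be consistent with the CLI reporting no URNs to delete.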