Bulk Deleting Datasets with Prefix test_* in DataHub

Original Slack Thread

Hi <@U06TM7M40RK>, how do I bulk delete datasets with prefix test_*?

To bulk delete datasets whose names start with test_ in DataHub, you can use the datahub delete command of the DataHub CLI with the --query option, which selects entities through a search query. Here’s a step-by-step guide:

  1. Use the datahub delete Command: Run datahub delete with --entity-type dataset and --query "test_*" to select the datasets that match the prefix.

  2. Perform a Hard Delete: Add the --hard flag to remove the entities permanently. Without it, the CLI performs a soft delete by default.

  3. Dry Run: Run the command with the --dry-run flag first to see which entities would be deleted before performing the actual delete operation.

Here is an example command:

datahub delete --entity-type dataset --query "test_*" --hard --dry-run

Example Command Breakdown:

  • --entity-type dataset: Specifies that the entity type is a dataset.
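  • --query "test_*": Selects datasets that match the search query test_*, where the trailing * is a wildcard. Because this is a search query rather than a strict name filter, review the matches in the dry-run output.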
  • --hard: Performs a hard delete, which physically deletes all rows for all aspects of the entity.
  • --dry-run: Shows what entities will be deleted without actually performing the deletion.
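
Since --query goes through search, it can be worth checking candidate URNs against the prefix before a hard delete. Below is a minimal sketch of such a check. It assumes dataset URNs of the form urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>) and reads one URN per line from stdin; filter_urns_by_prefix is a hypothetical helper name, not part of the DataHub CLI.

```shell
#!/bin/sh
# Hypothetical helper (not part of the DataHub CLI): keep only dataset URNs
# whose name component starts with the given prefix.
# Assumes URNs look like:
#   urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)
filter_urns_by_prefix() {
  # Split each URN on commas; field 2 is then the dataset name.
  # index(...) == 1 means the name starts with the prefix.
  awk -F',' -v p="$1" 'index($2, p) == 1 { print }'
}

# Usage: only the first URN has a name that starts with test_.
printf '%s\n' \
  'urn:li:dataset:(urn:li:dataPlatform:hive,test_orders,PROD)' \
  'urn:li:dataset:(urn:li:dataPlatform:hive,prod.latest_orders,PROD)' \
  | filter_urns_by_prefix 'test_'
# prints: urn:li:dataset:(urn:li:dataPlatform:hive,test_orders,PROD)
```

The check compares the prefix against the full name component, so a name such as db.test_orders would not pass; adjust the comparison if your names are qualified with a database or schema.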

Once you are sure about the entities to be deleted, you can remove the --dry-run flag to perform the actual deletion:

datahub delete --entity-type dataset --query "test_*" --hard
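
The dry-run-then-delete sequence above can be wrapped in a small script so that the real delete never runs without a preview. This is only a sketch: delete_datasets and the DATAHUB_BIN override are hypothetical names introduced here for illustration, and only the flags already shown in this thread are used.

```shell
#!/bin/sh
# DATAHUB_BIN is a hypothetical override (handy for testing);
# it defaults to the real datahub CLI.
DATAHUB_BIN="${DATAHUB_BIN:-datahub}"

# Hypothetical helper: preview, confirm, then hard-delete datasets
# that match the given search query.
delete_datasets() {
  query="$1"

  # Step 1: dry run, so the matching entities are listed but not deleted.
  "$DATAHUB_BIN" delete --entity-type dataset --query "$query" --hard --dry-run || return 1

  # Step 2: ask for explicit confirmation; anything but y/Y aborts.
  printf 'Proceed with hard delete for query "%s"? [y/N] ' "$query" >&2
  read -r answer
  case "$answer" in
    y|Y)
      # Step 3: the same command without --dry-run performs the delete.
      "$DATAHUB_BIN" delete --entity-type dataset --query "$query" --hard
      ;;
    *)
      echo "Aborted; nothing was deleted." >&2
      return 1
      ;;
  esac
}
```

After sourcing the script, calling delete_datasets "test_*" shows the dry-run output first and only proceeds on an explicit y. Depending on the CLI version, datahub delete may also ask for its own confirmation before deleting.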

Would you like more detailed information on any specific part of this process?
