Ingesting a Specific Table from Source Platform into DataHub using Recipe or CLI

Original Slack Thread

<@U06TM7M40RK> how to ingest exactly 1 specific table from source platform using recipe or cli ?

To ingest a single table from a source platform into DataHub, create an ingestion recipe that names the source platform and restricts ingestion to that table, then run the recipe with the DataHub CLI. The steps are:

  1. Create an Ingestion Recipe: Write a YAML configuration file (recipe) that specifies the source of the data and the destination (DataHub).

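  2. Specify the Table: In the source configuration, restrict ingestion to the one table you want, for example with an allow entry under table_pattern. Allow entries are regular expressions, not literal names.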

  3. Run the Ingestion: Use the DataHub CLI to run the ingestion based on the recipe.

Example: Ingesting a Specific Table from MySQL

Step 1: Create an Ingestion Recipe

Create a YAML file (e.g., mysql_table_ingestion.yaml) with the following content. The source section defines where metadata is read from and which table is allowed; the sink section defines the DataHub instance that receives it.

source:
  type: "mysql"
  config:
    host_port: "localhost:3306"
    database: "your_database"
    username: "your_username"
    password: "your_password"
    table_pattern:
      allow:
        - '^your_database\.your_table_name$'  # Regex for the one table to ingest; for MySQL it is matched against the fully qualified name (database.table)

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"  # Replace with your DataHub instance URL
    token: "your-auth-token"  # Replace with your DataHub auth token if required
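
Because allow entries are regular expressions, anchoring matters when exactly one table is wanted. The following is a minimal sketch using only Python's re module; the candidate names are hypothetical, and it assumes the pattern is applied with re.match against names of the form database.table, which is how DataHub's SQL sources typically behave.

```python
import re

# Hypothetical fully qualified names in the assumed database.table form.
candidates = [
    "your_database.your_table_name",
    "your_database.your_table_name_backup",
    "other_database.your_table_name",
]

# Anchored pattern: ^ and $ pin both ends, and \. matches a literal dot.
anchored = r"^your_database\.your_table_name$"
selected = [name for name in candidates if re.match(anchored, name)]

# Unanchored pattern: re.match only pins the start, so a longer name also matches.
unanchored = r"your_database.your_table_name"
loose = [name for name in candidates if re.match(unanchored, name)]

# A bare table name does not match at all, since matching starts at the database name.
bare = [name for name in candidates if re.match(r"your_table_name", name)]

print(selected)
print(loose)
print(bare)
```

Only the anchored pattern selects exactly one name here; the unanchored one also picks up the backup table, and the bare table name selects nothing.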

Step 2: Run the Ingestion

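Use the DataHub CLI to run the ingestion command with the created recipe. The CLI needs the MySQL source plugin, which can be installed with pip install 'acryl-datahub[mysql]'.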

datahub ingest -c mysql_table_ingestion.yaml

This command will read the metadata for the specified table from the MySQL database and ingest it into your DataHub instance.
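
The same recipe can also be built and run from Python instead of a YAML file. This is a rough sketch under stated assumptions: build_recipe and run_recipe are hypothetical helper names, the connection values are the placeholders from the example above, and run_recipe assumes the acryl-datahub package with the mysql plugin is installed and uses its Pipeline class.

```python
import re


def build_recipe(database: str, table: str) -> dict:
    """Return a recipe dict whose allow pattern targets exactly one MySQL table."""
    # re.escape guards against regex metacharacters in the names;
    # the pattern assumes matching against database.table.
    allow = f"^{re.escape(database)}\\.{re.escape(table)}$"
    return {
        "source": {
            "type": "mysql",
            "config": {
                "host_port": "localhost:3306",
                "database": database,
                "username": "your_username",  # placeholder
                "password": "your_password",  # placeholder
                "table_pattern": {"allow": [allow]},
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }


def run_recipe(recipe: dict) -> None:
    """Run the recipe in-process (needs acryl-datahub[mysql] installed)."""
    # Imported lazily so build_recipe works without DataHub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()  # raise if the run reported failures


recipe = build_recipe("your_database", "your_table_name")
print(recipe["source"]["config"]["table_pattern"]["allow"])
```

Calling run_recipe(recipe) would then perform the ingestion against the configured MySQL host and DataHub server.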
