Setting up Kafka Connect Connector to Ingest Data from PostgreSQL with DataHub v0.12.1

Original Slack Thread

<@U06TM7M40RK> is there a recipe that works for kafka connect that is connected to rds

The DataHub documentation I have ingested does not mention a specific recipe for Kafka Connect that is connected to RDS. You might want to provide more details about the specific RDS database type (e.g., MySQL, PostgreSQL) and the version of DataHub you are using. If this doesn’t help, please reach out to the DataHub support team for further assistance.

By the way, I found these links that might be relevant to what you’re looking for. Let me know if this helps:

  - Kafka Connect | DataHub
  - DataHub Documentation

The database is Postgres and the version of datahub is v0.12.1

<@U06TM7M40RK> The database is Postgres and the version of datahub is v0.12.1, I am trying to ingest a kafka connect connector that is connected to PostgreSQL

To ingest a Kafka Connect connector that is connected to a PostgreSQL database using DataHub version 0.12.1, you can follow the general guidelines for setting up a Kafka Connect ingestion source. Here are the steps you can take:

  1. Install the Kafka Connect Plugin: Ensure that you have the Kafka Connect plugin installed for DataHub. You can do this by running the following command:

    pip install 'acryl-datahub[kafka-connect]'
    
  2. Create an Ingestion Recipe: You need to create a YAML file for your ingestion recipe. Here is a sample configuration for a Kafka Connect source:

    source:
      type: "kafka-connect"
      config:
        connect_uri: "http://localhost:8083"  # Replace with your Kafka Connect URI
        username: "admin"  # Replace with your username if needed
        password: "password"  # Replace with your password if needed
        platform_instance_map:
          postgres: postgres_platform_instance  # Platform instance under which your PostgreSQL datasets were ingested
    sink:
      type: "datahub-rest"
      config:
        server: "http://localhost:8080"  # Replace with your DataHub server URI
    
  3. Run the Ingestion: Use the following command to run the ingestion:

    datahub ingest -c your_recipe.yml
    
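  4. Troubleshooting: If the ingestion fails, first confirm that the Kafka Connect REST endpoint given in connect_uri is reachable from the machine running the ingestion, and that username and password match its authentication settings (remove both fields if the endpoint is unauthenticated). If connectors are ingested but lineage to PostgreSQL is missing, check that the platform_instance_map value matches the platform instance used when the PostgreSQL datasets themselves were ingested.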
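
For the troubleshooting step, it can help to confirm what Kafka Connect itself reports before involving DataHub. The following is a minimal sketch, not part of DataHub: it calls the standard Kafka Connect REST endpoints (`/connectors`, `/connectors/{name}`, `/connectors/{name}/status`) using only the Python standard library. The helper names `make_fetcher` and `summarize_connectors` are hypothetical, and the URI and credentials in the usage note are the placeholders from the recipe above.

```python
import base64
import json
import urllib.parse
import urllib.request
from typing import Callable, Optional


def make_fetcher(connect_uri: str,
                 username: Optional[str] = None,
                 password: Optional[str] = None) -> Callable[[str], object]:
    """Return a function that GETs a path on the Kafka Connect REST API."""
    base = connect_uri.rstrip("/")

    def fetch(path: str) -> object:
        request = urllib.request.Request(
            base + path, headers={"Accept": "application/json"}
        )
        # Assumption: the endpoint uses HTTP basic auth when credentials are set.
        if username is not None and password is not None:
            token = base64.b64encode(f"{username}:{password}".encode()).decode()
            request.add_header("Authorization", f"Basic {token}")
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.load(response)

    return fetch


def summarize_connectors(fetch: Callable[[str], object]) -> list:
    """List each connector with its connector class and current state."""
    summary = []
    for name in fetch("/connectors"):
        quoted = urllib.parse.quote(name, safe="")
        info = fetch(f"/connectors/{quoted}")
        status = fetch(f"/connectors/{quoted}/status")
        summary.append({
            "name": name,
            "class": info["config"].get("connector.class"),
            "state": status["connector"]["state"],
        })
    return summary
```

To use it, call `summarize_connectors(make_fetcher("http://localhost:8083", "admin", "password"))` and print the rows. If this call fails or returns an empty list, the DataHub ingestion has nothing to read, so the problem lies in the Kafka Connect URI or credentials rather than in the recipe.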

For more detailed information, you can refer to the Kafka Connect documentation (https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/kafka-connect) and the Postgres ingestion documentation (https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/postgres).
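
As an alternative to the CLI command in step 3, the same recipe can be built and run from Python, which keeps credentials out of the YAML file. This is a sketch under assumptions: the environment variable names (`KAFKA_CONNECT_URI`, `KAFKA_CONNECT_USERNAME`, `KAFKA_CONNECT_PASSWORD`, `DATAHUB_GMS_URL`) and the helper names are hypothetical, and `run_ingestion` requires `acryl-datahub[kafka-connect]` to be installed.

```python
import os
from typing import Mapping


def build_recipe(env: Mapping[str, str] = os.environ) -> dict:
    """Build the recipe from step 2 as a dict, reading settings from env."""
    source_config = {
        "connect_uri": env.get("KAFKA_CONNECT_URI", "http://localhost:8083"),
        # Same placeholder mapping as in the YAML recipe.
        "platform_instance_map": {"postgres": "postgres_platform_instance"},
    }
    # Only send credentials when a username is configured.
    if env.get("KAFKA_CONNECT_USERNAME"):
        source_config["username"] = env["KAFKA_CONNECT_USERNAME"]
        source_config["password"] = env.get("KAFKA_CONNECT_PASSWORD", "")
    return {
        "source": {"type": "kafka-connect", "config": source_config},
        "sink": {
            "type": "datahub-rest",
            "config": {"server": env.get("DATAHUB_GMS_URL", "http://localhost:8080")},
        },
    }


def run_ingestion(recipe: dict) -> None:
    """Run the recipe through DataHub's ingestion pipeline."""
    # Imported lazily so build_recipe works without DataHub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()  # Raise if the run reported failures.
```

Calling `run_ingestion(build_recipe())` should behave like `datahub ingest -c your_recipe.yml` with the equivalent YAML.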

Would you like more detailed step-by-step guidance on any of these points?

Hi! Did this interaction resolve your issue, or do you still require assistance?

Please tag @RunLLM if you still need further assistance!

Or tag @Datahub Community Support if you require a person to take a look.

Hope this helps!