Developing a Custom Ingestion Source for Omni BI in DataHub

Original Slack Thread

<@U06TM7M40RK> How can I ingest a new source called Omni BI? The connector doesn’t exist, so we would need to develop it. What would we need?

To ingest metadata from a new source like Omni BI into DataHub, you will need to develop a custom ingestion source. Here are the steps you should follow:

  1. Set Up Your Environment: Ensure you have Python 3.8+ and Java 17 installed, as these are required for developing on DataHub’s metadata ingestion framework. You can set up your Python environment by following the instructions in the <https://datahubproject.io/docs/0.14.0/metadata-ingestion/developing|Developing on Metadata Ingestion> guide.

  2. Create a Configuration Model: Use Pydantic to define a configuration model for your source. This involves creating a class that inherits from ConfigModel and defining the necessary configuration fields. An example can be found in the <https://datahubproject.io/docs/0.14.0/metadata-ingestion/adding-source|Adding a Metadata Ingestion Source> documentation.

  3. Implement the Source: Develop the core logic for your source by implementing the get_workunits_internal method. This method should produce a stream of metadata events wrapped in MetadataWorkUnit objects. The existing BI connectors in the DataHub repository (for example Looker, Tableau, or Superset) are useful references for how dashboards and charts are modeled.

  4. Set Up the Reporter: Implement a reporter interface to log statistics, warnings, and other information about the ingestion run. You can use the default SourceReport class or extend it as needed.

  5. Testing and Documentation: Tests and documentation are optional if you are not contributing the source back to the DataHub project, but writing tests with the pytest framework and documenting your source for future reference is still recommended.

  6. Use the Source: Once your custom source is developed, you can use it to ingest metadata into DataHub. The <https://datahubproject.io/docs/how/add-custom-ingestion-source|guide on using custom ingestion sources> explains how to do this without forking DataHub.
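
To make steps 2 to 4 more concrete, here is a minimal, framework-free sketch of the shape such a connector might take. It deliberately uses plain dataclasses and dicts as stand-ins so that it runs without DataHub installed: in a real connector, OmniConfig would be a Pydantic model inheriting from ConfigModel, OmniReport would extend SourceReport, and each yielded dict would be a MetadataWorkUnit. The field names (base_url, api_key), the "omni" platform name, the shape of the dashboard records, and the /dashboards/<id> URL pattern are all assumptions made for illustration, not Omni's actual API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Iterator, List


@dataclass
class OmniConfig:
    """Stand-in for a Pydantic config model (step 2)."""

    base_url: str  # hypothetical: root URL of the Omni instance
    api_key: str   # hypothetical: credential for Omni's API
    env: str = "PROD"

    def __post_init__(self) -> None:
        # Validate early, as a Pydantic validator would.
        if not self.base_url.startswith(("http://", "https://")):
            raise ValueError("base_url must be an http(s) URL")
        self.base_url = self.base_url.rstrip("/")


@dataclass
class OmniReport:
    """Stand-in for a SourceReport subclass (step 4)."""

    dashboards_scanned: int = 0
    warnings: List[str] = field(default_factory=list)


class OmniSource:
    """Stand-in for the source class itself (step 3)."""

    platform = "omni"  # assumed platform name

    def __init__(self, config: OmniConfig, dashboards: Iterable[Dict[str, Any]]):
        self.config = config
        self.report = OmniReport()
        # A real connector would fetch these records from Omni's API;
        # they are injected here so the sketch runs offline.
        self._dashboards = dashboards

    def get_workunits_internal(self) -> Iterator[Dict[str, Any]]:
        # Lazily yield one work unit per dashboard, recording stats as we go.
        for record in self._dashboards:
            dashboard_id = record.get("id")
            if not dashboard_id:
                self.report.warnings.append(f"skipped dashboard without id: {record!r}")
                continue
            self.report.dashboards_scanned += 1
            yield {
                "id": f"{self.platform}-{dashboard_id}",
                "entityUrn": f"urn:li:dashboard:({self.platform},{dashboard_id})",
                "aspect": {
                    "title": record.get("name", dashboard_id),
                    # Hypothetical URL pattern, for illustration only.
                    "externalUrl": f"{self.config.base_url}/dashboards/{dashboard_id}",
                },
            }

    def get_report(self) -> OmniReport:
        return self.report


# Example run with two fake records, one of which is missing its id.
source = OmniSource(
    OmniConfig(base_url="https://omni.example.com/", api_key="not-a-real-key"),
    [{"id": "abc123", "name": "Revenue"}, {"name": "no id"}],
)
workunits = list(source.get_workunits_internal())
```

Keeping the Omni API access separate from the mapping logic, as done here by injecting the records, also makes the pytest tests mentioned in step 5 easier to write. For step 6, the custom-source guide describes referencing the class by its fully qualified Python path in the recipe's source type field once the package is importable.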

Would you like more detailed guidance on any of these steps?
