<@U06TM7M40RK> Am I able to use the dbt ingestion source but load the ingested metadata into a custom data platform I have created, i.e. not the dbt platform?
Yes, you can use the dbt ingestion source to load metadata into a custom data platform that you have created in DataHub. According to the <https://datahubproject.io/docs/0.14.0/how/add-custom-data-platform|DataHub documentation>, you can add a custom data platform by either changing the default data platforms before deployment or ingesting a new data platform at runtime. This allows you to associate datasets with your custom platform instead of the default dbt platform.
To achieve this, you would need to:
- Add a Custom Data Platform: Define your custom data platform in the data_platforms.json file or use the DataHub CLI to add it at runtime. For example, you can use the following CLI command:
  datahub put platform --name MyCustomDataPlatform --display_name "My Custom Data Platform" --logo "https://<your-logo-url>"
- Configure the Ingestion: When setting up your dbt ingestion, ensure that the target platform is set to your custom data platform. This can be done by specifying the custom platform in your ingestion configuration.
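For the data_platforms.json route, a new entry could be generated roughly as in the sketch below. The field names (`datasetNameDelimiter`, `displayName`, `type`, `logoUrl`) mirror the shape of the entries in the data_platforms.json file bundled with DataHub as I recall it, and the helper `custom_platform_entry` is hypothetical, so please check the result against the file in your own DataHub version before relying on it.

```python
import json


def custom_platform_entry(name, display_name, logo_url, delimiter="."):
    """Build one data_platforms.json-style entry (hypothetical helper)."""
    return {
        "urn": f"urn:li:dataPlatform:{name}",
        "aspect": {
            # Assumed field names; verify against your bundled data_platforms.json.
            "datasetNameDelimiter": delimiter,
            "name": name,
            "displayName": display_name,
            "type": "OTHERS",
            "logoUrl": logo_url,
        },
    }


entry = custom_platform_entry(
    "MyCustomDataPlatform",
    "My Custom Data Platform",
    "https://<your-logo-url>",  # placeholder kept from the CLI example
)
print(json.dumps(entry, indent=2))
```

The printed JSON object would be appended to the list in data_platforms.json before deployment; the CLI command achieves the same registration at runtime.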
If you need more detailed guidance on any of these steps, please let me know!
Sources:
- <https://datahubproject.io/docs/0.14.0/how/add-custom-data-platform|Adding a custom Dataset Data Platform>
<@U06TM7M40RK> Yes, I'd like more detailed steps on how to configure the ingestion recipe for the dbt source to use a custom data platform rather than defaulting to dbt as the data platform. Or is it the case that I can only configure the target platform?
To configure the ingestion recipe for the dbt source to use a custom data platform, you need to follow these steps:
- Add a Custom Data Platform: First, ensure that your custom data platform is added to DataHub. You can do this by modifying the data_platforms.json file or using the DataHub CLI to add it at runtime. Here’s how you can add it using the CLI:
  datahub put platform --name MyCustomDataPlatform --display_name "My Custom Data Platform" --logo "https://<your-logo-url>"
  This command will register your custom data platform with DataHub.
- Configure the dbt Ingestion Recipe: In your dbt ingestion recipe, specify your custom data platform as the target_platform. In the dbt source, target_platform names the platform that dbt loads onto, so this associates the underlying target datasets with your custom platform. Here is an example configuration for a dbt ingestion recipe using a custom data platform:
  source:
    type: dbt
    config:
      platform_instance: my_custom_instance
      target_platform: MyCustomDataPlatform # Use your custom data platform here
      manifest_path: path/to/manifest.json
      catalog_path: path/to/catalog.json
      # ... other configurations
- Run the Ingestion: Execute the ingestion process using the configured recipe. This will ingest the dbt metadata into DataHub, associating it with your custom data platform.
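By setting target_platform to your custom data platform, you ensure that the target datasets dbt loads onto are ingested under that platform. To answer the second part of your question: as far as I can tell, target_platform is the setting you can configure here, and the dbt model entities themselves are still emitted under the dbt platform.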
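To make the effect of target_platform concrete, here is a minimal sketch that holds the recipe as a Python dict and builds the dataset URNs one would expect for a single table. It assumes the standard DataHub dataset URN layout, `urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)`. The helper `dataset_urn` and the table name `analytics.public.orders` are hypothetical, and the real dbt source may normalize names or add a platform instance prefix, so treat this as an illustration rather than the source's actual logic.

```python
def dataset_urn(platform, name, env="PROD"):
    """Format a dataset URN in DataHub's standard layout (hypothetical helper)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# Same recipe as the YAML example, expressed as a dict.
recipe = {
    "source": {
        "type": "dbt",
        "config": {
            "manifest_path": "path/to/manifest.json",
            "catalog_path": "path/to/catalog.json",
            "target_platform": "MyCustomDataPlatform",
        },
    },
}

target = recipe["source"]["config"]["target_platform"]
table = "analytics.public.orders"  # hypothetical table name

dbt_urn = dataset_urn("dbt", table)      # the dbt model entity
target_urn = dataset_urn(target, table)  # the dataset on the target platform

print(dbt_urn)
print(target_urn)
```

If the behavior matches this sketch, you would see two related entities after ingestion: one under dbt and one under MyCustomDataPlatform. Checking the URNs in the DataHub UI after a test run is a quick way to confirm which platform each entity landed on.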
If you need further assistance or more detailed steps on any specific part of this process, feel free to ask!
Sources:
- <https://datahubproject.io/docs/0.14.0/how/add-custom-data-platform|Adding a custom Dataset Data Platform>