<@U06TM7M40RK> how to use _DATAHUB_TO_FILE
The __DATAHUB_TO_FILE_ directive in DataHub handles sensitive data in ingestion recipes by writing the value to a file and passing the file's path to the source. This is useful for data sources that expect file paths on the local file system, such as Kafka, BigQuery, and MySQL, especially when the ingestion process needs to be self-sufficient, as with UI ingestion.
How to Use __DATAHUB_TO_FILE_
- Syntax: The directive is written as __DATAHUB_TO_FILE_<property>: <value>. It is converted into <property>: <path to file containing value>. The value can be specified inline or through an environment variable or secret.
- Example Configuration: Suppose you need to use SSL with MySQL and have a sensitive key that should not be hard-coded in the recipe. You can use the directive as follows:

    source:
      type: mysql
      config:
        host_port: localhost:3306
        database: dbname
        username: root
        password: example
        options:
          connect_args:
            __DATAHUB_TO_FILE_ssl_key: '${secret}' # This will be converted to a file path

  In this example, ssl_key is specified using the __DATAHUB_TO_FILE_ directive, which writes the contents of the secret to a file and replaces the entry with ssl_key set to that file's path.
- Purpose: This approach improves security and flexibility: sensitive data is not hard-coded in the recipe and is instead passed through file paths that are generated when the recipe is processed.
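To make the conversion concrete, here is a minimal Python sketch of the idea. It is not DataHub's actual implementation: the function name expand_to_file_directives, the use of a temporary file, and the sample key contents are all assumptions made for illustration. It walks a recipe that has already been loaded into a dict and rewrites every __DATAHUB_TO_FILE_<property> key into <property> pointing at a file that holds the value.

```python
import os
import tempfile
from typing import Any

# Prefix described in the answer above.
PREFIX = "__DATAHUB_TO_FILE_"


def expand_to_file_directives(node: Any) -> Any:
    """Hypothetical helper: replace `__DATAHUB_TO_FILE_<property>: <value>`
    with `<property>: <path to a file containing value>`, recursively."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if isinstance(key, str) and key.startswith(PREFIX):
                # mkstemp creates the file readable only by the current user.
                fd, path = tempfile.mkstemp()
                with os.fdopen(fd, "w") as handle:
                    handle.write(str(value))
                result[key[len(PREFIX):]] = path
            else:
                result[key] = expand_to_file_directives(value)
        return result
    if isinstance(node, list):
        return [expand_to_file_directives(item) for item in node]
    return node


# The MySQL recipe from the example, with the secret already substituted
# (the key contents here are a made-up placeholder).
recipe = {
    "source": {
        "type": "mysql",
        "config": {
            "host_port": "localhost:3306",
            "options": {
                "connect_args": {
                    "__DATAHUB_TO_FILE_ssl_key": "placeholder-key-contents",
                }
            },
        },
    }
}

resolved = expand_to_file_directives(recipe)
connect_args = resolved["source"]["config"]["options"]["connect_args"]
print(connect_args)  # ssl_key now maps to a temporary file path
```

After the call, connect_args contains ssl_key with a file path as its value, and the original __DATAHUB_TO_FILE_ssl_key entry is gone, which mirrors the behavior described for the recipe.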
Would you like more detailed steps or additional information on any specific part of the process?
Sources:
- Document _DATAHUB_TO_FILE directive (GitHub)
- metadata-ingestion/recipe_overview.md: https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/recipe_overview.md
Hi <@U08PVDCPJAG>, how did my answer work out for you? Need any more assistance? You can also tag <@U07FV3UHE4D> to let your fellow community members know you’re still looking for guidance.