<@U06TM7M40RK> How do I add custom properties to a dataset in an ingestion recipe?
To add custom properties to a dataset during ingestion in DataHub, you can use the simple_add_dataset_properties transformer. It adds the key-value pairs you specify as custom properties on the dataset. You can configure it in your ingestion recipe YAML file in the following ways:
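- Basic Configuration: Add custom properties to a dataset alongside whatever the ingestion source already sends.
- Replace Existing Properties: Set replace_existing: true to replace the dataset properties sent by the ingestion source with the ones listed in the transformer.
- Overwrite Existing Properties: Set semantics: OVERWRITE (the default) to overwrite the dataset properties already stored for the dataset on DataHub GMS.
- Patch Existing Properties: Set semantics: PATCH to keep the dataset properties already stored on DataHub GMS and add the new ones.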
Basic Configuration
transformers:
  - type: "simple_add_dataset_properties"
    config:
      properties:
        prop1: value1
        prop2: value2
Replace Existing Properties
transformers:
  - type: "simple_add_dataset_properties"
    config:
      replace_existing: true  # false is the default behavior
      properties:
        prop1: value1
        prop2: value2
Overwrite Existing Properties
transformers:
  - type: "simple_add_dataset_properties"
    config:
      semantics: OVERWRITE  # OVERWRITE is the default behavior
      properties:
        prop1: value1
        prop2: value2
Patch Existing Properties
transformers:
  - type: "simple_add_dataset_properties"
    config:
      semantics: PATCH
      properties:
        prop1: value1
        prop2: value2
These configurations will add the specified properties to the dataset during the ingestion process.
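For context, here is a minimal sketch of a complete recipe showing where the transformers block sits relative to the source and sink. It assumes a MySQL source and a local DataHub REST sink; the host, database name, credentials, and server URL are placeholders rather than values from this thread, so swap in your own.

```yaml
# Sketch of a full recipe; all source and sink values are placeholders.
source:
  type: mysql                        # assumed source, replace with your own
  config:
    host_port: localhost:3306        # placeholder host and port
    database: my_database            # hypothetical database name
    username: datahub_user           # placeholder user
    password: "${MYSQL_PASSWORD}"    # read from an environment variable

transformers:
  - type: "simple_add_dataset_properties"
    config:
      semantics: PATCH               # keep properties already stored on DataHub GMS
      properties:
        prop1: value1
        prop2: value2

sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"  # assumed local GMS endpoint
```

If this is saved as recipe.yaml (a hypothetical filename), it can be run from the CLI with datahub ingest -c recipe.yaml.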
Would you like more detailed instructions or have any specific questions about this process?
Sources:
- <https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/docs/transformer/dataset_transformer.md|Dataset Transformer Documentation>
- <https://github.com/datahub-project/datahub/blob/master/docs/api/tutorials/custom-properties.md|DataHub Properties Documentation>
Hey there! Make sure your message includes the following information if relevant, so we can help more effectively!
- Are you using UI or CLI for ingestion?
- Which DataHub version are you using? (e.g. 0.12.0)
- What data source(s) are you integrating with DataHub? (e.g. BigQuery)
<@U06TM7M40RK> Can you help with the questions in this thread?
We’re experimenting with a new AI assistant for questions in the <#CUMUWQU66> channel to help you track down helpful resources more quickly. Keep in mind that responses may not always be spot-on; feel free to ask it follow-up questions, and don’t forget to rate its responses so other folks in the Community know if you’re still looking for help on the topic!
I’m not able to find any data sources to answer your question. Could you provide more details about the specific issue you are facing or the type of assistance you need with DataHub? If this doesn’t help, please reach out to the DataHub support team for further assistance.