This message was deleted.
Hey there! Make sure your message includes the following information if relevant, so we can help more effectively!
- Are you using UI or CLI for ingestion?
- Which DataHub version are you using? (e.g. 0.12.0)
- What data source(s) are you integrating with DataHub? (e.g. BigQuery)
<@U06TM7M40RK> Can you help with the questions in this thread?
We’re experimenting with a new AI assistant for questions in the <#CUMUWQU66> channel to help you track down helpful resources more quickly. Keep in mind that responses may not always be spot-on; feel free to ask it follow-up questions, and don’t forget to give its responses a thumbs-up or thumbs-down so other folks in the Community know if you’re still looking for help on the topic!
To help you more effectively, could you please provide the following information?
- Are you using the UI or CLI for ingestion?
- Which DataHub version are you using (e.g., 0.12.0)?
- What data source(s) are you integrating with DataHub (e.g., BigQuery)?
Based on your initial question, it seems like you might need to implement a custom ingestion source to handle the XML metadata from the API endpoint. For more details on how to add a custom ingestion source, you can refer to the <https://datahubproject.io/docs/0.13.1/how/add-custom-ingestion-source|DataHub documentation>.
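As a rough illustration of the first step such a custom source would need, here is a minimal sketch that turns XML into dataset records using only the Python standard library. The XML layout, `SAMPLE_XML`, and the helpers `make_dataset_urn` and `parse_datasets` are hypothetical, since the actual API response is not shown in this thread; a real source would wrap logic like this in the interface described in the linked documentation.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterator

# Hypothetical payload: the real XML returned by the API endpoint is not shown
# in this thread, so element and attribute names here are assumptions.
SAMPLE_XML = """
<datasets>
  <dataset name="SampleHiveDataset" platform="hive" env="PROD">
    <owner>jdoe</owner>
    <owner>datahub</owner>
  </dataset>
</datasets>
"""


def make_dataset_urn(platform: str, name: str, env: str) -> str:
    """Build a dataset URN in the urn:li:dataset:(platform,name,env) form."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def parse_datasets(xml_text: str) -> Iterator[Dict[str, object]]:
    """Yield one plain dict per <dataset> element found in the XML."""
    root = ET.fromstring(xml_text.strip())
    for node in root.iter("dataset"):
        yield {
            "urn": make_dataset_urn(
                node.get("platform", ""),
                node.get("name", ""),
                node.get("env", "PROD"),
            ),
            # Owner names are mapped to corpuser URNs; empty elements are skipped.
            "owners": [
                f"urn:li:corpuser:{owner.text.strip()}"
                for owner in node.findall("owner")
                if owner.text
            ],
        }


records = list(parse_datasets(SAMPLE_XML))
for record in records:
    print(record["urn"], record["owners"])
```

The records are kept as plain dicts so the parsing step can be tested on its own before it is wired into an ingestion source.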
Would you like more detailed guidance on how to proceed with creating a custom ingestion source?
Sources:
- <https://datahubproject.io/docs/0.13.1/how/add-custom-ingestion-source|DataHub documentation>
I’m using DataHub 0.13, and I’m using the UI for ingestion.
<@U06TM7M40RK> what is MetadataChangeEvent?
A Metadata Change Event (MCE) is a type of event in DataHub that represents a request to change multiple aspects of the same entity. It relies on the deprecated concept of a Snapshot, which is a strongly typed list of aspects for the same entity. MCEs are considered “proposals” for a set of metadata changes, as opposed to Metadata Audit Events (MAEs), which convey committed changes. Only MCEs that are successfully accepted and processed lead to the emission of a corresponding MAE or Metadata Change Log (MCL).
Key Points about MCE:
- Emission: MCEs may be emitted by clients of DataHub’s low-level ingestion APIs during the process of metadata ingestion.
- Consumption: DataHub’s storage layer actively listens for new Metadata Change Events and attempts to apply the requested changes to the Metadata Graph.
- Schema: The structure of an MCE is defined by its PDL schema.
Example of an MCE emitted to change the ‘ownership’ aspect of an entity:
{
  "proposedSnapshot": {
    "com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
      "urn": "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)",
      "aspects": [
        {
          "com.linkedin.pegasus2avro.common.Ownership": {
            "owners": [
              {
                "owner": "urn:li:corpuser:jdoe",
                "type": "DATAOWNER",
                "source": null
              },
              {
                "owner": "urn:li:corpuser:datahub",
                "type": "DATAOWNER",
                "source": null
              }
            ],
            "lastModified": {
              "time": 1581407189000,
              "actor": "urn:li:corpuser:jdoe",
              "impersonator": null
            }
          }
        }
      ]
    }
  }
}
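To make the shape of this payload easier to see, here is a minimal sketch that assembles the same ownership MCE as a plain Python dict and serializes it with the standard library. It does not use the DataHub SDK; `build_ownership_mce` is a hypothetical helper written for illustration, and only the key names and values are taken from the example above.

```python
import json
from typing import Any, Dict, List


def build_ownership_mce(
    dataset_urn: str,
    owner_urns: List[str],
    actor_urn: str,
    time_ms: int,
) -> Dict[str, Any]:
    """Hypothetical helper: wrap an Ownership aspect in a DatasetSnapshot MCE."""
    ownership = {
        # Every owner gets the same type as in the example payload.
        "owners": [
            {"owner": urn, "type": "DATAOWNER", "source": None}
            for urn in owner_urns
        ],
        "lastModified": {
            "time": time_ms,
            "actor": actor_urn,
            "impersonator": None,
        },
    }
    return {
        "proposedSnapshot": {
            "com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
                "urn": dataset_urn,
                # A snapshot carries a list of aspects for one entity.
                "aspects": [
                    {"com.linkedin.pegasus2avro.common.Ownership": ownership}
                ],
            }
        }
    }


mce = build_ownership_mce(
    dataset_urn="urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)",
    owner_urns=["urn:li:corpuser:jdoe", "urn:li:corpuser:datahub"],
    actor_urn="urn:li:corpuser:jdoe",
    time_ms=1581407189000,
)

# None values are serialized as JSON null, matching the example.
print(json.dumps(mce, indent=2))
```

The nesting mirrors the definition given earlier: one entity URN, plus a list of typed aspects that the MCE proposes to change.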
For more detailed information, you can refer to the <Metadata Events | DataHub Change Event documentation>.
Would you like more detailed guidance on how to work with MCEs or any other specific aspect of DataHub?
Sources:
- <Metadata Events | DataHub Change Event documentation>