<@U06TM7M40RK> is it possible to ingest column descriptions from a file, or do I have to enter them manually?
Yes, it is possible to ingest column descriptions from a file into DataHub. You can use the CSV Enricher to bulk upload metadata such as descriptions, tags, glossary terms, and owners. This process involves preparing a CSV file with the necessary metadata and using the DataHub CLI to ingest the file.
According to the <https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/csv|CSV Enricher documentation>, you can create a CSV file with columns like resource, subresource, description, etc., and then use a YAML recipe to configure the ingestion process. Here is a brief overview of the steps:
- Prepare the CSV File: Create a CSV file with the necessary metadata. The CSV should have a header row and columns such as resource, subresource, description, etc.
  Example CSV:
    resource,subresource,description
    "urn:li:dataset:(urn:li:dataPlatform:postgres,post.user.table1,PROD)",,"Table 1 description"
    "urn:li:dataset:(urn:li:dataPlatform:postgres,post.user.table1,PROD)",column1,"Column 1 description"
- Create a Recipe File: Write a YAML configuration file for the ingestion process.
  Example Recipe:
    source:
      type: csv-enricher
      config:
        filename: ./path/to/your/file.csv
        write_semantics: PATCH # or OVERRIDE
- Run the Ingestion: Use the DataHub CLI to ingest the CSV file.
    datahub ingest -c path/to/your/recipe.yaml
This process will update the descriptions for the specified tables and columns in DataHub.
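If the descriptions already live somewhere structured (a spreadsheet export, a data dictionary), the CSV from the first step can be generated rather than typed by hand. Below is a minimal Python sketch using only the standard library. The helper names (dataset_urn, build_rows, write_csv) are hypothetical and not part of DataHub, and the three headers are simply the ones from the example above; the enricher may expect additional columns (such as tags or owners), so check the header list against the documentation for your version.

```python
import csv
import io

# Headers copied from the example CSV in this thread (assumption: your
# DataHub version may expect more columns than these three).
FIELDNAMES = ["resource", "subresource", "description"]


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN in the same shape as the example above."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def build_rows(platform, dataset_name, table_description, column_descriptions, env="PROD"):
    """Return one row for the table itself, then one row per column."""
    urn = dataset_urn(platform, dataset_name, env)
    # An empty subresource means the description applies to the table.
    rows = [{"resource": urn, "subresource": "", "description": table_description}]
    for column, text in column_descriptions.items():
        rows.append({"resource": urn, "subresource": column, "description": text})
    return rows


def write_csv(rows, out):
    """Write rows with a header; the csv module quotes the URN because it contains commas."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


rows = build_rows(
    "postgres",
    "post.user.table1",
    "Table 1 description",
    {"column1": "Column 1 description"},
)
buffer = io.StringIO()
write_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a real file, pass a handle opened with open("file.csv", "w", newline="") to write_csv instead of the in-memory buffer, then point filename in the recipe at that path.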
Would you like more detailed steps or examples on any part of this process?
Sources:
- <https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/csv|CSV Enricher documentation>
This link is broken: https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/csv