Ingesting Column Descriptions into DataHub from a File

Original Slack Thread

Is it possible to ingest column descriptions from a file, or do I have to enter them manually?

Yes, it is possible to ingest column descriptions from a file into DataHub. You can use the CSV Enricher to bulk upload metadata such as descriptions, tags, glossary terms, and owners. This process involves preparing a CSV file with the necessary metadata and using the DataHub CLI to ingest the file.

According to the CSV Enricher documentation (https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/csv), you create a CSV file with columns such as resource, subresource, and description, then use a YAML recipe to configure the ingestion. Here is a brief overview of the steps:

  1. Prepare the CSV File: Create a CSV file with a header row. The documented header is resource, subresource, glossary_terms, tags, owners, ownership_type, description, domain; columns you don't need can be left empty. A row with an empty subresource describes the dataset itself, while a row whose subresource names a column describes that column. A minimal description-only example is shown below, and a fuller row using the list-valued columns follows it.

    Example CSV:

    resource,subresource,glossary_terms,tags,owners,ownership_type,description,domain
    "urn:li:dataset:(urn:li:dataPlatform:postgres,post.user.table1,PROD)",,,,,,"Table 1 description",
    "urn:li:dataset:(urn:li:dataPlatform:postgres,post.user.table1,PROD)",column1,,,,,"Column 1 description",

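    The same file format carries tags, glossary terms, and owners, so one row can attach several kinds of metadata at once. The row below is a sketch based on the documented format; the tag, term, and owner URNs are placeholders, and list values are wrapped in brackets with entries separated by the array delimiter (| by default):

    "urn:li:dataset:(urn:li:dataPlatform:postgres,post.user.table1,PROD)",column1,[urn:li:glossaryTerm:Classification.Sensitive],[urn:li:tag:PII],[urn:li:corpuser:jdoe],TECHNICAL_OWNER,"Column 1 description",
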
  2. Create a Recipe File: Write a YAML configuration file for the ingestion process. The source block points at the CSV; the sink shown below is a common setup for a local DataHub instance and should be adjusted for your deployment.

    Example Recipe:

    source:
      type: csv-enricher
      config:
        filename: ./path/to/your/file.csv
        write_semantics: PATCH  # or OVERRIDE
        array_delimiter: "|"  # separator for entries inside bracketed list values

    sink:
      type: datahub-rest
      config:
        server: http://localhost:8080

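    With write_semantics: PATCH, the enricher merges the CSV contents into whatever editable metadata already exists; OVERRIDE replaces it. PATCH is generally the safer choice when descriptions may also have been edited in the UI.
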
  3. Run the Ingestion: Use the DataHub CLI to ingest the CSV file.

    datahub ingest -c path/to/your/recipe.yaml
    
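    To check what would be ingested without writing anything, the CLI also supports a dry run (see datahub ingest --help for the options available in your version):

    datahub ingest -c path/to/your/recipe.yaml --dry-run
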

This process will update the descriptions for the specified tables and columns in DataHub.
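
If you'd rather skip the recipe and write descriptions from a script, the DataHub Python SDK can emit the same editable schema metadata directly. The sketch below is an illustration, not the CSV enricher's implementation: it assumes a local DataHub instance at http://localhost:8080 and a hypothetical columns.csv with at least resource, subresource, and description columns, and unlike PATCH semantics it replaces any editable column descriptions already set on each dataset it touches.

    import csv

    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import (
        EditableSchemaFieldInfoClass,
        EditableSchemaMetadataClass,
    )

    # Assumed local DataHub endpoint; adjust for your deployment.
    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

    # Group column-level rows (non-empty subresource) by dataset URN.
    fields_by_dataset = {}
    with open("columns.csv", newline="") as f:  # hypothetical input file
        for row in csv.DictReader(f):
            if row["subresource"]:
                fields_by_dataset.setdefault(row["resource"], []).append(
                    EditableSchemaFieldInfoClass(
                        fieldPath=row["subresource"],
                        description=row["description"],
                    )
                )

    # Caution: this upserts the whole editableSchemaMetadata aspect per dataset,
    # replacing editable column descriptions set earlier (e.g. via the UI).
    for urn, fields in fields_by_dataset.items():
        emitter.emit(
            MetadataChangeProposalWrapper(
                entityUrn=urn,
                aspect=EditableSchemaMetadataClass(editableSchemaFieldInfo=fields),
            )
        )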

Would you like more detailed steps or examples on any part of this process?

Sources:

CSV Enricher | DataHub Docs: https://datahubproject.io/docs/0.13.1/generated/ingestion/sources/csv (this version-pinned link no longer resolves; look for the CSV Enricher page in the current DataHub documentation)