Ingesting a CSV File and Updating DataHub With Changes

Original Slack Thread

Hello! Can you help me? I’m trying to ingest a CSV file. If I change the file, the changes are not reflected in DataHub, e.g. when I add new tags.

Does the dataset schema exist prior to ingestion?

Umm, no. I’m trying to create an example to show to our client. I’m trying to simulate a table from a PostgreSQL database.

Because we don’t have direct access to the database.

But I need to create an example from dummy data similar to the client’s data, because he can’t understand the DataHub examples.

You can’t create a schema using CSV ingestion. CSV ingestion can only add tags/descriptions to an existing schema, not create one.
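
For illustration, an enrichment CSV of the kind this source reads could be generated roughly as below. This is only a sketch: the column header is what the DataHub `csv-enricher` docs list as far as I recall, so please verify it against the current docs. The dataset URN, tag names, and helper function names are hypothetical, and the dataset has to exist in DataHub already for the tags to show up.

```python
import csv

# Column header expected by the csv-enricher source (assumption: verify
# against the current DataHub docs before relying on it).
HEADER = ["resource", "subresource", "glossary_terms", "tags",
          "owners", "ownership_type", "description", "domain"]

# Hypothetical dataset URN; the dataset must already exist in DataHub.
DATASET_URN = (
    "urn:li:dataset:(urn:li:dataPlatform:postgres,demo.public.customers,PROD)"
)


def build_rows():
    """Return one dataset-level row and one column-level row."""
    blank = dict.fromkeys(HEADER, "")
    dataset_row = {**blank,
                   "resource": DATASET_URN,
                   "tags": "[urn:li:tag:Demo]",
                   "description": "Dummy customer table"}
    # subresource names the column (field path) the tag applies to.
    column_row = {**blank,
                  "resource": DATASET_URN,
                  "subresource": "email",
                  "tags": "[urn:li:tag:PII]"}
    return [dataset_row, column_row]


def write_enrichment_csv(path):
    """Write the rows to `path`; the csv module quotes the URN's commas."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=HEADER)
        writer.writeheader()
        writer.writerows(build_rows())
```

Calling `write_enrichment_csv("enrichment.csv")` produces a file that a CSV ingestion recipe could point at; editing a `tags` cell and re-running the ingestion is then how a tag change would be picked up.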

You can either ingest from a Postgres database or create your own schema like this: https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/examples/library/dataset_schema_with_tags_terms.py
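
A trimmed-down sketch in the spirit of that example, assuming the `acryl-datahub` Python package is installed and a DataHub GMS is reachable at `http://localhost:8080`. The table name, columns, and tags are made up for the demo, and `emit_demo_schema` is a hypothetical helper name, not part of the SDK.

```python
# Hypothetical demo table: (column name, native PostgreSQL type, tags).
COLUMNS = [
    ("id", "integer", []),
    ("email", "varchar(255)", ["PII"]),
]


def emit_demo_schema(gms_server="http://localhost:8080"):
    """Emit a schemaMetadata aspect for a dummy Postgres-style dataset."""
    # Imported lazily so COLUMNS can be inspected without the SDK installed.
    from datahub.emitter.mce_builder import (
        make_data_platform_urn, make_dataset_urn, make_tag_urn,
    )
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import (
        GlobalTagsClass, NumberTypeClass, OtherSchemaClass, SchemaFieldClass,
        SchemaFieldDataTypeClass, SchemaMetadataClass, StringTypeClass,
        TagAssociationClass,
    )

    fields = []
    for name, native_type, tags in COLUMNS:
        logical = NumberTypeClass() if native_type == "integer" else StringTypeClass()
        # Attach column-level tags only when the column has any.
        global_tags = (
            GlobalTagsClass(tags=[TagAssociationClass(tag=make_tag_urn(t)) for t in tags])
            if tags else None
        )
        fields.append(SchemaFieldClass(
            fieldPath=name,
            type=SchemaFieldDataTypeClass(type=logical),
            nativeDataType=native_type,
            globalTags=global_tags,
        ))

    schema = SchemaMetadataClass(
        schemaName="customers",
        platform=make_data_platform_urn("postgres"),
        version=0,
        hash="",
        platformSchema=OtherSchemaClass(rawSchema=""),
        fields=fields,
    )
    # Hypothetical dataset name; no real Postgres connection is needed.
    urn = make_dataset_urn("postgres", "demo.public.customers", "PROD")
    DatahubRestEmitter(gms_server=gms_server).emit(
        MetadataChangeProposalWrapper(entityUrn=urn, aspect=schema)
    )
```

Once `emit_demo_schema()` has run against a live instance, the dataset should exist with its columns, so CSV ingestion would then have a schema to attach tags and descriptions to.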

OK, if I have a CSV dataset on GitHub, can I use S3 ingestion?

You can clone the repo and configure the folder path in the S3 recipe (https://datahubproject.io/docs/generated/ingestion/sources/s3).
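
A rough sketch of what such a recipe might look like when built and run from Python. It assumes the `s3` source accepts a local folder in `path_specs.include` (as the reply above suggests) and that `acryl-datahub` is installed with the S3 plugin. The folder path, server URL, and both function names are hypothetical, and the exact config keys should be checked against the linked docs.

```python
def build_local_csv_recipe(repo_dir, server="http://localhost:8080"):
    """Build a recipe dict pointing the s3 source at CSVs in a cloned repo."""
    include = f"{repo_dir.rstrip('/')}/*.csv"
    return {
        "source": {
            "type": "s3",
            "config": {
                # Assumption: a local path is accepted here in place of an
                # s3:// URI; verify against the S3 source documentation.
                "path_specs": [{"include": include}],
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": server}},
    }


def run_recipe(recipe):
    """Run the recipe programmatically (requires the DataHub SDK)."""
    # Imported lazily so the recipe can be built without the SDK installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()
```

For example, `run_recipe(build_local_csv_recipe("/path/to/cloned/repo"))` would ingest the CSV files in that folder. The same structure can be written as a YAML recipe and run with `datahub ingest -c recipe.yml`.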

Thanksss