Checking Extracted Raw Data Before Ingesting to DataHub and Troubleshooting Missing UpstreamLineage Values

Original Slack Thread

Hi everyone!
Does anybody know how to check the raw data extracted from the ingestion source before it gets ingested into DataHub?
Some of the dataset entities are missing UpstreamLineage values, and I want to find out why.

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Which DataHub version are you using? (e.g. 0.12.0)
  2. Please post any relevant error logs on the thread!

You can use the file sink to write the extracted metadata to a file instead of sending it to DataHub:

```
sink:
  type: file
  config:
    filename: /path/to/mce/file.json
```
<https://datahubproject.io/docs/metadata-ingestion/sink_docs/file>
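Once the recipe has written its output to the file, a short script can list which dataset URNs never received an `upstreamLineage` aspect. The following is a minimal sketch, not an official DataHub tool. It assumes the file is a JSON array whose records are either MCP-style (`entityUrn`, `aspectName`) or MCE-style (`proposedSnapshot` wrapping a list of aspects); check that against your actual file, since the layout can vary by DataHub version. The function names are made up for illustration.

```python
import json
from collections import defaultdict


def aspects_by_urn(records):
    """Map each entity URN to the set of aspect names seen for it."""
    seen = defaultdict(set)
    for rec in records:
        if "entityUrn" in rec and "aspectName" in rec:
            # Assumed MCP-style record: one aspect per record.
            seen[rec["entityUrn"]].add(rec["aspectName"])
        elif "proposedSnapshot" in rec:
            # Assumed MCE-style record: a snapshot wrapping a list of aspects.
            for snapshot in rec["proposedSnapshot"].values():
                urn = snapshot.get("urn")
                if urn is None:
                    continue
                names = seen[urn]  # register the URN even if it has no aspects
                for aspect in snapshot.get("aspects", []):
                    for class_name in aspect:
                        # e.g. "...dataset.UpstreamLineage" -> "upstreamLineage"
                        short = class_name.rsplit(".", 1)[-1]
                        names.add(short[:1].lower() + short[1:])
    return seen


def datasets_missing_lineage(records):
    """Return sorted dataset URNs that have no upstreamLineage aspect."""
    return sorted(
        urn
        for urn, names in aspects_by_urn(records).items()
        if urn.startswith("urn:li:dataset:") and "upstreamLineage" not in names
    )


def load_records(path):
    """Load the file written by the file sink (assumed to be a JSON array)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

To use it, call `datasets_missing_lineage(load_records("/path/to/mce/file.json"))` with the same path as in the recipe. If a dataset shows up in the result, the source never emitted lineage for it, so the gap is on the extraction side rather than in DataHub itself.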

Okay, thank you so much!