Best Practices for Writing Unit Tests for Inspecting Metadata Change Proposals

Original Slack Thread

Hello friends, not sure where to put this, but I’m wondering what the best practice is for writing unit tests that inspect metadata change proposals (MCPs). For example, I have a method that creates an MCP for BrowsePaths. I would really like something like this to work, but it does not, probably because I don’t understand how the Avro (de)serialization works under the covers. Anyone have thoughts? I’m using 0.13.1.


We have some unit tests, e.g. test_bigquery_lineage.py (https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/tests/unit/test_bigquery_lineage.py), that directly check the aspects created by helper methods. Most often, though, we write “integration” tests where we create “golden” files representing the expected MCPs, then use:

to compare the files, as done in test_bigquery.py (https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/tests/integration/bigquery_v2/test_bigquery.py). You can generate the golden file by passing --update-golden-files when you run the test.
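
For anyone who wants to see the general shape of the golden-file pattern, here is a minimal, stdlib-only sketch. It is not DataHub's own test helper: `check_golden`, `browse_paths_record`, and the dict layout are hypothetical stand-ins for serialized MCPs, and the `update` argument only imitates what a flag like --update-golden-files would do.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def browse_paths_record(urn: str, paths: List[str]) -> Dict[str, Any]:
    # Hypothetical, simplified stand-in for a serialized MCP.
    # The real DataHub serialization has more fields and a different layout.
    return {"entityUrn": urn, "aspectName": "browsePaths", "aspect": {"paths": paths}}


def check_golden(records: List[Dict[str, Any]], golden_path: Path, update: bool = False) -> None:
    # Round-trip through JSON so the in-memory records are compared in the
    # same form they would have after being written to and read from disk.
    actual = json.loads(json.dumps(records, sort_keys=True))

    if update:
        # Regenerate the golden file instead of comparing against it.
        golden_path.write_text(json.dumps(actual, indent=2, sort_keys=True))
        return

    expected = json.loads(golden_path.read_text())
    assert actual == expected, f"records differ from golden file {golden_path}"
```

In a pytest test you would build the records with the code under test, call `check_golden(records, golden_path)`, and rerun once with `update=True` whenever the expected output changes on purpose. Comparing the serialized form avoids depending on how the in-memory objects implement equality.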

Interesting, thank you!