Troubleshooting Airflow Lineage Emission with Datahub Package Versions

Original Slack Thread

Hey all, our Airflow lineage emission using the DataHub package suddenly stopped working as expected. We are getting an empty inlet, but the outlet is populated correctly. What is confusing is that the properties tab in Airflow still shows the inlets and outlets properly as Datasets, but the inlets are not being emitted to DataHub as inputs to the Airflow task. This happens seemingly at random, and we made no changes that I can identify that would have caused it.

In case someone else sees this issue, it is going to be fixed by https://github.com/datahub-project/datahub/pull/8631

:saluting_face:

<@U01GZEETMEZ> I’m seeing this issue on plugin version 0.10.5.5. What version do I need to upgrade to for the fix?

We haven’t cut a proper CLI release in a bit, but the newest rc releases have the fix, e.g. 0.10.5.6rc4

We’re planning on cutting a non-rc release this week

<@U01GZEETMEZ> thank you - we’ll wait for the non-rc release. Just to clarify, would we need to upgrade just the plugin, or would we need to upgrade DataHub itself as well? We are on 0.10.5 for DataHub (no minor release version specified in our build)

Just upgrading the plugin should be fine

Ran into this issue as well. Has anyone upgraded to the 0.11.0 version of the plugin yet? I’m getting this error: "Exception: The package 'acryl-datahub' from setuptools and acryl-datahub-airflow-plugin do not match. Please make sure they are aligned"

<@U05KW29UW84> I ran into the same error - I think the plugin has to be upgraded at the same time as DataHub itself. We haven’t actually done this yet - waiting to see if there will be a quick patch for any potential bugs in 0.11.0

Ah cool. Yeah, I think we’re just going to use 0.10.5.1 plugin for now so the Airflow inlets work.
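
For anyone debugging the mismatch error above, here is a minimal sketch for checking which versions are actually installed in the Airflow environment. It only uses the standard library. The package names come from the error message. The helper names (installed_versions, versions_aligned) are made up for illustration, and treating "aligned" as "identical version strings" is an assumption, not something confirmed in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the error message quoted above.
PACKAGES = ("acryl-datahub", "acryl-datahub-airflow-plugin")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for pkg in packages:
        try:
            found[pkg] = version(pkg)
        except PackageNotFoundError:
            found[pkg] = None
    return found


def versions_aligned(versions):
    """Assumption: 'aligned' means every package is installed at the same version."""
    distinct = set(versions.values())
    return None not in distinct and len(distinct) == 1
```

Running versions_aligned(installed_versions()) inside the same virtualenv or container that runs the Airflow scheduler should show whether the two packages have drifted apart, for example after pinning only the plugin.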

Yup that should be fine as a workaround for now! I’m aware of the error you mentioned and am working on it (along with some other exciting airflow improvements :eyes: )

I ran into the same error, thanks for the workaround!

<@U01GZEETMEZ> I faced the same issue, so I downgraded the plugin version, but after the downgrade the import
from datahub_airflow_plugin.entities import Dataset, Urn
stopped working. I am trying the example at https://github.com/datahub-project/datahub/blob/master/metadata-ingestion-modules/airflow-plugin/src/datahub_airflow_plugin/example_dags/lineage_backend_demo.py

I checked the code, and it seems these classes are newly introduced. Is there a way I can stay on the older version but still try the example? Or if you have another example for older versions, let me know.

Also, to send lineage data to DataHub, do I need to modify my existing DAGs, or is there a way that is friendlier to existing DAGs?

With the old version, you can import using from datahub_provider.entities import Dataset
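
If a DAG file has to work across both plugin versions, one option is to resolve the class at import time. This is a rough sketch, not an official pattern: the two module paths are the ones mentioned in this thread, and load_entity_class is a hypothetical helper name. Only Dataset is looked up, since the thread does not say whether Urn exists under the old path.

```python
import importlib

# Module paths from this thread: newer plugin first, older location second.
CANDIDATE_MODULES = (
    "datahub_airflow_plugin.entities",
    "datahub_provider.entities",
)


def load_entity_class(name, modules=CANDIDATE_MODULES):
    """Return attribute `name` from the first importable module, else None."""
    for module_path in modules:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            # Module not installed in this environment; try the next candidate.
            continue
        if hasattr(module, name):
            return getattr(module, name)
    return None


# None if neither module is importable in this environment.
Dataset = load_entity_class("Dataset")
```

Note that only ImportError is caught here, so the version mismatch Exception mentioned earlier in the thread would still surface rather than being silently skipped.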

We’re working on some stuff to avoid manually annotating lineage on your dags - stay tuned

This should be fixed in 0.11.0.1

<@U01GZEETMEZ> to clarify, we should be able to upgrade the plugin to 0.11.0.1, but leave DataHub on 0.10.5, right?