Integrating the Spark lineage listener: Choosing the correct library version

Original Slack Thread

We are currently integrating the Spark lineage listener and wonder which version of the library we should use. The documentation at https://datahubproject.io/docs/0.13.1/metadata-ingestion/source_overview has two main sections (in the left menu) named “Spark”. One points to https://search.maven.org/search?q=a:datahub-spark-lineage and the other to https://search.maven.org/search?q=a:acryl-spark-lineage. datahub-spark-lineage seems to be synchronized with DataHub release versions; is this the correct one to use?

I recommend using the new Spark lineage plugin: https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta

So that is the acryl-spark-lineage Maven artifact ID, then?

yes

The latest release of that one is older than the datahub-spark-lineage variant.

what is the main difference?

is the non-beta going out of support?

Yes, it is not maintained and will be replaced by the new one.

The new one is based on OpenLineage; it has many more features and is much more reliable.

thanks. i want to say “makes sense” but i am not fully there yet :slightly_smiling_face:

haha

ok. getting there :slightly_smiling_face:

Be aware that the documentation has two entries with exactly the same title.

where?

I can update it

This is how it looks now: ![attachment](https://slack-files.com/TUMKD5EGJ-F07287VK74M-5927641105)

![attachment](https://slack-files.com/TUMKD5EGJ-F072DK5AXAQ-28a67b36e7)

for 0.13.1 though

ahh, yes, I fixed it in the meantime
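
For anyone landing on this thread later, here is a minimal sketch of the recommendation above (use acryl-spark-lineage rather than datahub-spark-lineage), written as Spark configuration. Only the artifact ID comes from the thread. The `io.acryl` group ID, the `datahub.spark.DatahubSparkListener` class name, and the `spark.datahub.rest.server` key are taken from memory of the DataHub Spark lineage docs and should be checked against the page linked above. The helper names, the `<plugin-version>` placeholder, and the `http://localhost:8080` URL are illustrative assumptions.

```python
from __future__ import annotations


def datahub_lineage_conf(plugin_version: str, gms_url: str) -> dict[str, str]:
    """Return Spark config entries that load the acryl-spark-lineage plugin.

    Hypothetical helper: the keys and values are assumptions to verify
    against the DataHub Spark lineage documentation.
    """
    return {
        # Maven coordinates; the artifact ID is the one recommended in the thread.
        "spark.jars.packages": f"io.acryl:acryl-spark-lineage:{plugin_version}",
        # Listener class that Spark instantiates at startup (assumed name).
        "spark.extraListeners": "datahub.spark.DatahubSparkListener",
        # DataHub REST endpoint that receives the lineage events (assumed key).
        "spark.datahub.rest.server": gms_url,
    }


def apply_conf(builder, conf: dict[str, str]):
    """Apply each key/value pair to a SparkSession-style builder and return it."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


if __name__ == "__main__":
    # Placeholder version and URL: substitute the values for your deployment.
    conf = datahub_lineage_conf("<plugin-version>", "http://localhost:8080")
    # Print the settings in the form accepted by spark-submit.
    for key, value in conf.items():
        print(f"--conf {key}={value}")
```

With PySpark installed, the same dictionary could be passed through `apply_conf(SparkSession.builder.appName("demo"), conf).getOrCreate()`. Keeping the settings in a plain dictionary makes it easy to reuse them for `spark-submit` flags or a `spark-defaults.conf` file.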