We are currently integrating the Spark lineage listener and wonder which version of the library we should use. The documentation at https://datahubproject.io/docs/0.13.1/metadata-ingestion/source_overview has two main sections (in the left menu) named “Spark”. One points to https://search.maven.org/search?q=a:datahub-spark-lineage and the other to https://search.maven.org/search?q=a:acryl-spark-lineage. datahub-spark-lineage seems to be synchronized with DataHub release versions; is this the correct one to use?
Hey there! Make sure your message includes the following information if relevant, so we can help more effectively!
- Are you using UI or CLI for ingestion?
- Which DataHub version are you using? (e.g. 0.12.0)
- What data source(s) are you integrating with DataHub? (e.g. BigQuery)
I recommend using the new Spark lineage plugin: https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta
which is the acryl-spark-lineage Maven artifact ID then?
yes
the release of that is older than the datahub variant
what is the main difference?
is the non-beta going out of support?
Yes, it is not maintained and will be replaced by the new one.
The new one is based on OpenLineage; it has many more features and is much more reliable.
thanks. i want to say “makes sense” but i am not fully there yet
haha
ok. getting there
be aware that you have two entries in the documentation with exactly the same title
where?
I can update it
this is how it looks now: [attachment: CleanShot 2024-05-07 at 14.23.24@2x.png]
[attachment: image.png]
for 0.13.1 though
ahh, yes, I fixed it in the meantime
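For anyone landing on this thread later, here is a minimal sketch of wiring up the recommended acryl-spark-lineage artifact. It only builds the Spark settings and a `spark-submit` argument list; it does not start Spark. The Maven group `io.acryl`, the listener class `datahub.spark.DatahubSparkListener`, and the `spark.datahub.rest.server` key are given as I recall them from the plugin documentation linked above, so please verify them there. The helper names `lineage_conf` and `spark_submit_args` are hypothetical, and the version is left as a parameter on purpose.

```python
from typing import Dict, List

# Assumption: listener class name as given in the plugin documentation.
LISTENER_CLASS = "datahub.spark.DatahubSparkListener"


def lineage_conf(version: str, gms_url: str) -> Dict[str, str]:
    """Build Spark settings that load the acryl-spark-lineage listener.

    version: artifact version chosen from Maven Central (not fixed here).
    gms_url: URL of the DataHub REST (GMS) endpoint for your deployment.
    """
    return {
        # Assumption: Maven coordinates are io.acryl:acryl-spark-lineage.
        "spark.jars.packages": f"io.acryl:acryl-spark-lineage:{version}",
        "spark.extraListeners": LISTENER_CLASS,
        "spark.datahub.rest.server": gms_url,
    }


def spark_submit_args(conf: Dict[str, str], app: str) -> List[str]:
    """Turn the settings into a spark-submit argument list."""
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # "<version>" and "my_job.py" are placeholders, not real values.
    conf = lineage_conf("<version>", "http://localhost:8080")
    print(" ".join(spark_submit_args(conf, "my_job.py")))
```

The same key/value pairs can also go into `spark-defaults.conf` or be set on a `SparkSession` builder; keeping them in one dictionary just makes it easy to switch the artifact version in a single place.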