In the Spark lineage beta (acryl-spark-lineage-0.2.9), spark.datahub.metadata.table.hive_platform_alias is not working. It is in the docs, but I don't think it is in the code.
Hey there! Make sure your message includes the following information if relevant, so we can help more effectively!
- Are you using UI or CLI for ingestion?
- Which DataHub version are you using? (e.g. 0.12.0)
- What data source(s) are you integrating with DataHub? (e.g. BigQuery)
<@U06TM7M40RK> Can you help with the questions in this thread?
We're experimenting with a new AI assistant for questions in the <#CUMUWQU66> channel to help you track down helpful resources more quickly. Keep in mind that responses may not always be spot-on; feel free to ask it follow-up questions, and don't forget to leave a reaction on its responses so other folks in the Community know if you're still looking for help on the topic!
To assist you effectively, I need a bit more information:
- Are you using the UI or CLI for ingestion?
- Which version of DataHub are you using? (e.g., 0.12.0)
- What data source(s) are you integrating with DataHub? (e.g., BigQuery)
Based on your message, it seems you are encountering issues with the spark.datahub.metadata.table.hive_platform_alias configuration in the Spark Lineage Beta (acryl-spark-lineage-0.2.9). This configuration is documented but may not be functioning as expected in the code.
According to the https://github.com/datahub-project/datahub/blob/master/metadata-integration/java/spark-lineage-beta/README.md|documentation, the spark.datahub.metadata.table.hive_platform_alias configuration is intended to let users set the platform alias for Hive tables, such as switching to Glue if using AWS Glue as the Hive metastore. However, if this configuration is not working in your setup, it might be due to discrepancies between the documentation and the actual implementation in the code.
To further investigate, you can:
- Check the Version: Ensure that you are using a version of the Spark Lineage Beta that supports this configuration. Sometimes, documented features are not available in all versions.
- Review the Code: Look into the source code of acryl-spark-lineage-0.2.9 to verify whether spark.datahub.metadata.table.hive_platform_alias is implemented. You can find the relevant codebase on https://github.com/datahub-project/datahub|GitHub.
- Community and Support: Engage with the DataHub community on Slack or GitHub Issues for more immediate support. There may be ongoing discussions or patches available that address this issue.
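One way to do the "Review the Code" step without cloning the repo is to scan the released jar (or its sources jar) for the key name. The following is a minimal sketch, not an official DataHub tool: the helper name `find_keys_in_jar` is hypothetical, and it simply searches the raw bytes of every entry. Because a config key may be assembled from a prefix and a suffix in the code, searching for only the last segment of the key is likely more reliable than searching for the full dotted name.

```python
import zipfile


def find_keys_in_jar(jar_path, keys):
    """Return {key: [entry names whose raw bytes contain that key]}.

    A jar is a zip archive, and plain ASCII string constants appear
    unmodified in both .java sources and compiled .class files, so a
    byte-level substring search is enough for a quick check.
    """
    hits = {key: [] for key in keys}
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            data = jar.read(name)  # directory entries read as b""
            for key in keys:
                if key.encode("utf-8") in data:
                    hits[key].append(name)
    return hits
```

For example, calling `find_keys_in_jar(path_to_jar, ["hive_platform_alias", "hivePlatformAlias"])` on a locally downloaded jar (the path is yours to supply) shows which spelling, if either, the artifact actually contains.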
Would you like more detailed guidance on any of these steps or further assistance with a specific aspect of your setup?
Sources:
- https://github.com/datahub-project/datahub/blob/master/metadata-integration/java/spark-lineage-beta/README.md|documentation
- https://github.com/datahub-project/datahub|GitHub
The https://central.sonatype.com/artifact/io.acryl/acryl-spark-lineage/0.2.9/overview|release is different from the master code.
The release version produces Hive lineage, but a build from master produces S3 lineage.
In the source jar, the Hive platform alias key is camel case, not snake case.
The https://datahubproject.io/docs/metadata-integration/java/spark-lineage-beta|docs say spark.datahub.metadata.table.hive_platform_alias, but spark.datahub.metadata.dataset.hivePlatformAlias is the one that works.
I don't know why the code differs from the release, though.
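For anyone hitting the same thing on 0.2.9, here is a rough sketch of how the working key could be passed to spark-submit. The helper `build_spark_submit_args` is hypothetical and only builds the argument list. The camel-case key and the `io.acryl:acryl-spark-lineage:0.2.9` coordinates come from this thread; the listener class `datahub.spark.DatahubSparkListener` is an assumption based on the README, and the alias value (`glue` in the usage note) is just the example mentioned above, so check both against your setup.

```python
def build_spark_submit_args(hive_platform_alias, lineage_version="0.2.9"):
    """Build spark-submit arguments that set the camel-case alias key."""
    conf = {
        # Assumption: listener class name as given in the project README.
        "spark.extraListeners": "datahub.spark.DatahubSparkListener",
        # Key reported in this thread as working on the 0.2.9 release,
        # unlike the documented snake-case table.hive_platform_alias.
        "spark.datahub.metadata.dataset.hivePlatformAlias": hive_platform_alias,
    }
    args = ["--packages", f"io.acryl:acryl-spark-lineage:{lineage_version}"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args
```

Calling `build_spark_submit_args("glue")` returns a list that can be appended to a `spark-submit` command. Other settings your deployment needs, such as the DataHub server address, are left out here on purpose.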