Hi everyone, how is data lineage information stored in DataHub? Is it stored in a database (MySQL? Neo4j?)? How is that reflected in the source code? I don't know which part of the source code to look at.
Hi flash. Personally, I think there are a couple of places in the source code to look at to see how lineage data is stored and processed in DataHub.
• The metadata-ingestion module, which contains all the supported sources that can be ingested into DataHub. From my understanding, lineage-related data such as table lineage or column lineage is processed there. The core component used in this module is SQLAlchemy, which is the basis for the SQL-based sources (an example class is “SQLAlchemySource”).
• The metadata-io module, which has different implementations of the Java interface ‘GraphService’, which is located under the metadata-services module. So depending on the implementation, the storage layer for the lineage data can be ES (Elasticsearch) or Neo4j…
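To make the second point a bit more concrete, here is a rough sketch in Python of the "one interface, several storage backends" idea. This is not DataHub's actual code: the names `LineageGraphService`, `InMemoryGraphService`, `Edge` and `record_table_lineage` are made up for illustration, and the "DownstreamOf" relationship name and the example URNs are assumptions on my side. A real backend would write to Elasticsearch or Neo4j instead of a dict.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Edge:
    """One lineage relationship between two entities (hypothetical shape)."""
    source_urn: str          # the downstream entity
    destination_urn: str     # the upstream entity it depends on
    relationship_type: str   # assumed name, e.g. "DownstreamOf"


class LineageGraphService(ABC):
    """Hypothetical stand-in for a graph-storage interface.

    Callers only talk to this interface, so the storage layer can be
    swapped without touching the code that produces lineage.
    """

    @abstractmethod
    def add_edge(self, edge: Edge) -> None:
        ...

    @abstractmethod
    def get_upstreams(self, urn: str) -> List[str]:
        ...


class InMemoryGraphService(LineageGraphService):
    """Toy backend that keeps edges in a dict, keyed by source URN."""

    def __init__(self) -> None:
        self._edges_by_source: Dict[str, List[Edge]] = {}

    def add_edge(self, edge: Edge) -> None:
        self._edges_by_source.setdefault(edge.source_urn, []).append(edge)

    def get_upstreams(self, urn: str) -> List[str]:
        # Only follow lineage edges; ignore any other relationship types.
        return [
            e.destination_urn
            for e in self._edges_by_source.get(urn, [])
            if e.relationship_type == "DownstreamOf"
        ]


def record_table_lineage(
    graph: LineageGraphService, downstream: str, upstreams: List[str]
) -> None:
    """What an ingestion step might do: one edge per upstream table."""
    for upstream in upstreams:
        graph.add_edge(Edge(downstream, upstream, "DownstreamOf"))


if __name__ == "__main__":
    graph = InMemoryGraphService()  # swap in another backend here
    record_table_lineage(
        graph,
        "urn:li:dataset:(urn:li:dataPlatform:mysql,shop.order_summary,PROD)",
        ["urn:li:dataset:(urn:li:dataPlatform:mysql,shop.orders,PROD)"],
    )
    print(graph.get_upstreams(
        "urn:li:dataset:(urn:li:dataPlatform:mysql,shop.order_summary,PROD)"
    ))
```

The point is only that the code producing lineage does not care which backend sits behind the interface, which is why the same lineage can end up in different stores depending on how the deployment is configured.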
That’s what I know so far. Hope this helps.
<@U05B3RG1PD1> Your answer is very helpful. Thank you very much