Building Container-to-Dataset Lineage in the S3 Case

Original Slack Thread

Hi guys! I couldn’t find any information anywhere on how to build container-to-dataset lineage, or vice versa. Is it even possible? What should I do in the S3 case, where the top-level dataset is represented as a container?

A dataset has a container aspect. You can set it to the container (,Dataset,PROD)/Schema?is_lineage_mode=false&schemaFilter=)

Thx, Siddique!

It is clear that each dataset can be part of a container (IsPartOf). My question is different. Suppose I have a Spark/Flink service that consumes data from an S3 folder (a container) that contains partition subfolders (years, months, days, hours, which are also containers), and the dataset files sit only below those. In this case, how do I correctly build container-to-dataset lineage? How do I calculate the container URN? Right now it looks like urn:li:container:8aedea34fd2377eae316eca5464c2034

Hi, just wondering if you found a solution for container-to-dataset lineage? I have the same use case and didn’t find anything in the docs. <@U05A57K96F2>

Unfortunately, not yet.
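
On the narrower question above of how a container URN such as urn:li:container:8aedea34fd2377eae316eca5464c2034 might be calculated: the 32-character suffix looks like an MD5 hex digest. To my understanding, DataHub's Python ingestion code derives it by hashing a deterministic JSON serialization of the container's key fields. The sketch below reproduces that idea with the standard library only. The field names (platform, env, instance, folder_abs_path) and the sample path are assumptions made for illustration, and the exact key fields depend on the ingestion source and DataHub version, so this will not necessarily reproduce the URNs an existing ingestion run emitted. It also does not address the lineage question itself.

```python
import hashlib
import json


def container_guid(key_fields: dict) -> str:
    """Hash the non-empty key fields into a 32-character hex GUID."""
    # Drop unset fields so that optional keys do not change the hash.
    cleaned = {k: v for k, v in key_fields.items() if v is not None}
    # Sorted keys and compact separators make the serialization deterministic.
    serialized = json.dumps(cleaned, separators=(",", ":"), sort_keys=True)
    return hashlib.md5(serialized.encode("utf-8")).hexdigest()


def container_urn(key_fields: dict) -> str:
    """Wrap the GUID in the urn:li:container: prefix seen in the thread."""
    return f"urn:li:container:{container_guid(key_fields)}"


# Hypothetical key for one S3 partition folder; names and values are assumptions.
folder_key = {
    "platform": "s3",
    "env": "PROD",
    "instance": None,
    "folder_abs_path": "my-bucket/events/2023/01",
}
urn = container_urn(folder_key)
print(urn)
```

If the key fields match what the ingestion source used, the same inputs always give the same URN, which is what would let a Spark or Flink job refer to a partition folder without looking it up first.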