<@U06TM7M40RK> In Tableau ingestion I see this message:
Fixing duplicate field paths in schema aspect for urn by dropping fields:
What does this message mean, and why is it dropping fields from the schema?
The message you are seeing, “Fixing duplicate field paths in schema aspect for urn by dropping fields,” indicates that during the ingestion process, there were duplicate field paths detected in the schema metadata. This message is part of a mechanism to handle such duplicates by dropping the redundant fields to prevent issues in the DataHub UI and ensure data integrity.
According to the <https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py|source_helpers.py file>, the function auto_fix_duplicate_schema_field_paths is responsible for identifying and removing duplicate field paths in schema metadata. This is done to prevent UI bugs that can occur when duplicate field paths are present, as the UI may not render schemas correctly if field paths are not unique.
The process involves:
- Iterating over the schema metadata aspects.
- Checking for duplicate field paths.
- Dropping the duplicate fields while retaining the first occurrence to maintain a unique set of field paths.
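The steps above can be sketched roughly as follows. This is a simplified, standalone illustration and not the actual DataHub code: the `Field` class and the `drop_duplicate_field_paths` function are hypothetical names made up for this example, and the real helper operates on DataHub schema metadata aspects rather than plain dataclasses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Field:
    """Hypothetical stand-in for a schema field; not a DataHub class."""
    field_path: str
    native_type: str


def drop_duplicate_field_paths(fields: List[Field]) -> Tuple[List[Field], List[str]]:
    """Keep the first field seen for each field path and drop later repeats.

    Returns the de-duplicated fields plus the list of dropped field paths.
    """
    seen = set()
    kept: List[Field] = []
    dropped: List[str] = []
    for field in fields:
        if field.field_path in seen:
            # A field with this path was already kept, so drop this one.
            dropped.append(field.field_path)
        else:
            seen.add(field.field_path)
            kept.append(field)
    return kept, dropped


fields = [Field("id", "int"), Field("name", "string"), Field("id", "string")]
kept, dropped = drop_duplicate_field_paths(fields)
print([f.field_path for f in kept])  # ['id', 'name']
print(dropped)  # ['id']
```

In this sketch the second `id` field is dropped even though its type differs from the first, because only the field path is compared. The `dropped` list corresponds to the kind of field list that would follow "by dropping fields:" in the log message.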
This approach is necessary because duplicate field paths can lead to rendering issues in the UI, as noted in the <https://github.com/datahub-project/datahub/pull/10590|pull request> that introduced this feature. The duplicates are dropped to ensure that each field path is unique, which is crucial for the correct functioning of the DataHub platform.
Would you like more detailed information on how this process is implemented or any other aspect of this feature?
Sources:
- <https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py|source_helpers.py>
- <https://github.com/datahub-project/datahub/pull/10590|Pull Request: feat(fieldpaths): prevent duplicate field paths>