seems that ingestion fails to recognize the structure of datasets with nested map columns, i.e. map(x, map())
huh that’s interesting
to debug this more, we’ll need to determine if the issue is coming from the datahub integration, or from the underlying sqlalchemy-clickhouse driver
did the ingestion report any errors/warnings related to “type mappings”?
I opened an issue on GitHub: https://github.com/datahub-project/datahub/issues/9079
'at.audit_log_local': ["unable to get column information due to an error -> Map.__init__() missing 1 required positional argument: 'value_type'"]},
i can provide more logs if you like
```
id Int,
metadata Map(String, Map(String, Nullable(String)))
) ENGINE = MergeTree()
order by id
```
it doesn't like this type of column
Ah so it looks like a bug in the underlying sqlalchemy-clickhouse driver
We’re calling out to the underlying driver here: https://github.com/datahub-project/datahub/blob/f402090c1ebec9601e5fef6e45879d3a0a015dbd/metadata-ingestion/src/datahub/ingestion/source/sql/sql_common.py#L817 (the inspector class is a shim provided by sqlalchemy to call the driver). It seems like they’re trying to parse map types recursively here: https://github.com/xzkostyan/clickhouse-sqlalchemy/blob/a3140a931199d9a0b451c4efaba375fb15224b2e/clickhouse_sqlalchemy/drivers/base.py#L242, but the split on `,` doesn’t respect the `()` from the inner Map type
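To illustrate the comma-splitting problem, here is a rough standalone sketch. This is not the driver's actual code: `naive_split` and `split_top_level` are hypothetical helpers made up for this example. They only show how splitting on every `,` cuts a nested Map apart, and how tracking parenthesis depth avoids that.

```python
from typing import List


def naive_split(inner: str) -> List[str]:
    """Split on every comma, ignoring parentheses (the failure mode)."""
    return [part.strip() for part in inner.split(",")]


def split_top_level(inner: str) -> List[str]:
    """Split on commas only when we are not inside any parentheses."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in inner:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # Top-level comma: close off the current argument.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


# The arguments of the outer Map(...) from the table definition in this thread.
inner = "String, Map(String, Nullable(String))"
print(naive_split(inner))      # three broken pieces
print(split_top_level(inner))  # key type and value type, intact
```

With the depth-aware version the outer Map gets exactly two arguments (key type and value type), and the inner `Map(String, Nullable(String))` can then be parsed recursively in the same way.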
<@U062DGN53JL> I’ve created a github issue on that repo https://github.com/xzkostyan/clickhouse-sqlalchemy/issues/269 - feel free to add any additional relevant details on that thread, and hopefully they can resolve it quickly
For small bugs we will often monkeypatch the underlying library directly, but this is a somewhat complex bug, so I’d rather just wait for it to be fixed in the underlying library
thank you!