Troubleshooting dataset structure recognition in DataHub and the underlying ClickHouse driver

Original Slack Thread

seems that the structure of datasets with Map(x, Map(...)) columns isn't recognized

huh that’s interesting

to debug this more, we’ll need to determine if the issue is coming from the datahub integration, or from the underlying sqlalchemy-clickhouse driver

did the ingestion report any errors/warnings related to “type mappings”? i opened an issue on GitHub

'at.audit_log_local': ["unable to get column information due to an error -> Map.__init__() missing 1 required positional argument: 'value_type'"]},

i can provide more logs if you like

```
	id Int,
	metadata Map(String, Map(String, Nullable(String)))
) ENGINE = MergeTree()
order by id
```
it doesn't like this type of a column

Ah so it looks like a bug in the underlying sqlalchemy-clickhouse driver

We’re calling out to the underlying driver (the inspector class is a shim provided by sqlalchemy to call the driver). It seems like they’re trying to parse Map types recursively, but the split on `,` doesn’t respect the `()` from the inner Map type
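To illustrate the failure mode described above, here is a minimal sketch in plain Python. It is not the driver's actual code: `naive_split` and `split_top_level` are hypothetical helper names, and the only thing taken from the thread is the idea that splitting the Map arguments on every comma ignores nested parentheses.

```python
def naive_split(args):
    """Split on every comma, ignoring parentheses (the suspected bug pattern)."""
    return [part.strip() for part in args.split(",")]


def split_top_level(args):
    """Split only on commas that are not inside parentheses."""
    parts = []
    current = []
    depth = 0
    for ch in args:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma at depth 0 separates two top-level arguments.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


# Arguments of the outer Map from the column in this thread:
# Map(String, Map(String, Nullable(String)))
inner = "String, Map(String, Nullable(String))"

naive = naive_split(inner)
fixed = split_top_level(inner)

print(naive)  # ['String', 'Map(String', 'Nullable(String))']
print(fixed)  # ['String', 'Map(String, Nullable(String))']
```

With the naive split, the nested type arrives as broken fragments such as `Map(String`. That could plausibly explain a Map being constructed with a key type but no `value_type`, though the exact code path in the driver is an assumption here.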

I’ve created a GitHub issue on that repo - feel free to add any additional relevant details on that thread, and hopefully they can resolve it quickly

For small bugs we often will monkeypatch the underlying library directly, but this is a somewhat complex bug so I’d rather just wait for it to be fixed in the underlying library
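For readers unfamiliar with the term, monkeypatching means replacing a library function at runtime. The sketch below shows the general shape only. `fake_driver`, `parse_type`, and `apply_patch` are all hypothetical stand-ins, not names from DataHub or the ClickHouse driver, and the fallback value `"UNKNOWN"` is an assumption chosen for illustration.

```python
import types

# Hypothetical stand-in for a third-party module; not the real driver package.
fake_driver = types.SimpleNamespace()


def _buggy_parse(spec):
    """Stand-in for a library function that fails on nested Map types."""
    if spec.count("Map(") > 1:
        raise TypeError(
            "Map.__init__() missing 1 required positional argument: 'value_type'"
        )
    return spec


fake_driver.parse_type = _buggy_parse


def apply_patch(module):
    """Wrap module.parse_type so a TypeError degrades to a placeholder type."""
    original = module.parse_type

    def patched(spec):
        try:
            return original(spec)
        except TypeError:
            # Fall back instead of failing the whole column listing.
            return "UNKNOWN"

    module.parse_type = patched


apply_patch(fake_driver)

nested = fake_driver.parse_type("Map(String, Map(String, Nullable(String)))")
simple = fake_driver.parse_type("Int")
```

A wrapper like this only hides the error. It does not parse the nested type correctly, which is one reason a proper fix in the library is preferable for a bug of this kind.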

thank you!