Troubleshooting Decimal Import Issue in DataHub v0.12

Original Slack Thread


Source: Athena
Problem: Decimals can’t be imported since v0.12 (also in the latest v0.12.0.1)

Error: Unable to parse column <column_name> and type DECIMAL the error was: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
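
For readers unfamiliar with this message: it is the standard Python `TypeError` raised when `int()` receives `None`. The snippet below is only a minimal illustration of that behavior; the helper name `describe_int_failure` is hypothetical and is not part of DataHub or PyAthena.

```python
def describe_int_failure(value):
    """Return the name of the exception raised by int(value), or None on success."""
    try:
        int(value)
    except TypeError as exc:
        # int() accepts strings, bytes-like objects and numbers, but not None
        return type(exc).__name__
    return None


print(describe_int_failure(None))  # TypeError
print(describe_int_failure("38"))  # None
```

This suggests that somewhere during column parsing a value expected to be numeric (for example a decimal precision or scale) arrived as `None`, although the thread does not show the exact call site.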

Caused by:

Hi <@U038J2ST49F>,

I get the same error message on my end.
However, the table is still displayed in the UI.
Nonetheless, this is clearly a bug; I just wanted to check whether it behaves the same on your end.

Has this ever worked for you before, or are you trying to ingest this Athena table for the first time?
It seems like this issue is caused by the dependency used for retrieving table metadata from Athena.

Hey <@U049WUH4155>, thanks for reproducing it. In CLI v0.12.0.0 the table looked like the one in your screenshot. The table metadata was ingested, but all the decimal columns were displayed twice with a dot in between, like yours.
I think that with CLI v0.12.0.1 + DataHub v0.12.0 the pipeline broke at that point and stopped ingestion, but I can’t remember for sure.
I can try it again tomorrow at work, but if you also used v0.12.0.1, this could be something else.

This worked correctly before. I think in DataHub v0.11 it worked fine; I tried different CLI versions.

I reproduced it with
Thanks for the context!
I’ll try to come up with a fix :slightly_smiling_face:

Sneak peek: I was able to resolve it

Giving some more background: I made a contribution to DataHub 0.12.0 that enables the Athena source to handle and display complex types such as maps and structs properly. Unfortunately, I broke the type detection for decimals when passing the simple types back to the dependency (PyAthena).
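
To illustrate the kind of type detection described here, the following is a rough sketch of parsing a decimal type string defensively, so that a missing precision or scale stays `None` instead of being passed to `int()`. It is not the actual DataHub or PyAthena code; the function name `parse_decimal_type` and the regular expression are assumptions made purely for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: "decimal" with an optional "(precision[, scale])" suffix.
_DECIMAL_RE = re.compile(
    r"^decimal(?:\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\))?$", re.IGNORECASE
)


def parse_decimal_type(
    type_str: str,
) -> Optional[Tuple[Optional[int], Optional[int]]]:
    """Return (precision, scale) for a decimal type string, or None if not decimal."""
    match = _DECIMAL_RE.match(type_str.strip())
    if match is None:
        return None  # not a decimal type at all
    precision, scale = match.groups()
    # Convert only the parts that are present; never call int(None).
    return (
        int(precision) if precision is not None else None,
        int(scale) if scale is not None else None,
    )


print(parse_decimal_type("decimal(10,2)"))  # (10, 2)
print(parse_decimal_type("decimal"))        # (None, None)
```

The point of the sketch is the explicit `is not None` check before conversion, which avoids the `TypeError` reported at the top of the thread when the type string carries no arguments.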


<@U049WUH4155> thanks, I just merged it