Hi <@U06TM7M40RK> I’m getting this warning when ingesting MongoDB in DataHub v0.14.0:
An unexpected issue occurred
Unrecognized column types found: field type <class ‘bytes’>
The warning message “Unrecognized column types found: field type <class ‘bytes’>” indicates that schema inference during MongoDB ingestion encountered a field whose value type, bytes, it does not map to a known column type. This most likely means that some sampled documents contain binary data.
To address this issue, you can try the following steps:
- Check Data Types: Ensure that the data types in your MongoDB collections are supported by DataHub’s ingestion framework. The bytes type might not be directly supported, and you may need to convert it to a supported type like string or binary.
- Schema Inference Configuration: You can adjust the schema inference settings in your MongoDB ingestion configuration to better handle the data types. For example, you can enable or disable schema inference, use random sampling, or set a maximum schema size.
- Custom Schema Handling: If the default schema inference does not work for your use case, you might need to implement custom logic to handle specific data types. This can involve writing custom ingestion code or modifying the existing ingestion source code.
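To make the first step concrete, here is a minimal sketch for locating the fields that hold bytes values in a sample of documents, plus one possible conversion to base64 text. It assumes the documents are plain Python dicts, as a MongoDB driver would return them, and that binary values arrive as Python bytes. The helper names find_bytes_fields and bytes_to_base64 are hypothetical and are not part of DataHub.

```python
import base64
from typing import Any, Dict, Iterable, Set


def find_bytes_fields(docs: Iterable[Dict[str, Any]]) -> Set[str]:
    """Return the dotted path of every field whose value is a bytes object."""
    found: Set[str] = set()

    def walk(value: Any, path: str) -> None:
        if isinstance(value, (bytes, bytearray)):
            found.add(path)
        elif isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else str(key))
        elif isinstance(value, list):
            for child in value:
                walk(child, path)  # array elements share the parent path

    for doc in docs:
        walk(doc, "")
    return found


def bytes_to_base64(value: Any) -> Any:
    """Recursively replace bytes values with base64 text; leave the rest as-is."""
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(bytes(value)).decode("ascii")
    if isinstance(value, dict):
        return {k: bytes_to_base64(v) for k, v in value.items()}
    if isinstance(value, list):
        return [bytes_to_base64(v) for v in value]
    return value


# Hypothetical sample documents, standing in for a few rows of a collection.
sample = [
    {"_id": 1, "name": "a", "payload": b"\x00\x01"},
    {"_id": 2, "meta": {"thumb": b"abc"}, "tags": ["x"]},
]
print(sorted(find_bytes_fields(sample)))
print(bytes_to_base64(sample[1]))
```

With a real collection you could pass a limited cursor (for example, the result of find().limit(100) in pymongo) instead of the sample list. Whether to convert the stored data at all, and to which representation, depends on your use case.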
Here is an example configuration for MongoDB ingestion with schema inference settings:
source:
  type: mongodb
  config:
    connect_uri: 'mongodb://<username>:<password>@<host>:<port>'
    username: '<your-username>'
    password: '<your-password>'
    authMechanism: 'DEFAULT'
    enableSchemaInference: true
    useRandomSampling: true
    maxSchemaSize: 300
    ssl: true
    ssl_ca_certs: '/path/to/ca.pem'
    ssl_certfile: '/path/to/client.pem'
    ssl_keyfile: '/path/to/client-key.pem'
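If you want to try different schema inference settings quickly, the same kind of recipe can also be run from Python. The sketch below builds a reduced recipe as a dict, using only the schema inference keys from the configuration above, and sends the output to the console sink so nothing is written to DataHub. The function names build_recipe and run_recipe and the localhost URI in the usage note are illustrative assumptions; run_recipe requires the acryl-datahub package with the MongoDB plugin installed.

```python
from typing import Any, Dict


def build_recipe(connect_uri: str) -> Dict[str, Any]:
    """Build a minimal MongoDB recipe that prints metadata to the console."""
    return {
        "source": {
            "type": "mongodb",
            "config": {
                "connect_uri": connect_uri,
                "enableSchemaInference": True,
                "useRandomSampling": True,
                "maxSchemaSize": 300,
            },
        },
        # The console sink only prints the emitted metadata, which is
        # convenient while experimenting with inference settings.
        "sink": {"type": "console"},
    }


def run_recipe(recipe: Dict[str, Any]) -> None:
    """Run the recipe; imported lazily so the builder works without DataHub."""
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.pretty_print_summary()  # warnings such as the one above show up here
    pipeline.raise_from_status()
```

For example, run_recipe(build_recipe("mongodb://localhost:27017")) would run one ingestion pass against a local instance, assuming one is reachable at that address.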
If you continue to face issues, you may need to look into the specific error logs and adjust the configuration accordingly.
Would you like more detailed steps on how to troubleshoot or fix this issue?
Sources:
- <https://forum.datahubproject.io/t/understanding-and-resolving-errors-in-ingesting-data-from-mongodb/933|Understanding and resolving errors in ingesting data from MongoDB>