‘AvroException’ not found in ‘avro.schema’ module during metadata ingestion
Description:
I encountered an error while using the DataHub metadata-ingestion framework (version 0.14.0.2). Pipeline execution fails with an AttributeError: the ‘avro.schema’ module has no attribute ‘AvroException’.
Steps to Reproduce:
- Set up a metadata ingestion pipeline using the File source.
- Attempt to run the pipeline with a JSON file containing metadata.
- The pipeline fails with the following error:
Error Message:
PipelineExecutionError: ('Source reported errors', FileSourceReport(…))
Traceback:
Traceback (most recent call last):
  File "…/batch_pipeline_ingest.py", line 108, in <module>
    run_pipeline(config)
  File "…/batch_pipeline_ingest.py", line 103, in run_pipeline
    pipeline.raise_from_status()
  File "…/datahub/ingestion/run/pipeline.py", line 594, in raise_from_status
    raise PipelineExecutionError(
datahub.configuration.common.PipelineExecutionError: ('Source reported errors', FileSourceReport(…))
The FileSourceReport contains multiple entries with the same error:
"module 'avro.schema' has no attribute 'AvroException'"
Environment:
- Operating System: WSL2 (Ubuntu 22.04)
- Python Version: 3.10
- DataHub Version: 0.14.0.2
- Relevant package versions:
  - avro: [version]
  - [any other relevant packages and their versions]
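The package versions left open above could be collected with a short script. This is only a minimal sketch: it assumes the relevant PyPI distributions are named `avro` and `acryl-datahub`, and the helper names `installed_version` and `version_report` are made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution names; adjust to whatever is installed in the environment.
PACKAGES = ["avro", "acryl-datahub"]


def version_report(packages=PACKAGES):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in version_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Running it inside the same virtual environment as the pipeline matters, since a different interpreter may see a different avro release.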
Expected Behavior:
The pipeline should successfully process the metadata JSON file without raising an AttributeError related to ‘AvroException’.
Actual Behavior:
The pipeline fails with a PipelineExecutionError; the entries in the FileSourceReport state that the ‘avro.schema’ module has no attribute ‘AvroException’.
Additional Context:
This error occurs consistently across multiple runs and affects the processing of various metadata entries in the JSON file.
Possible Related Issues:
- Is there a version mismatch between the avro library and the version expected by DataHub?
- Has there been a recent change in the avro library that might have removed or renamed ‘AvroException’?
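One way I could test the two questions above is to check which module, if any, exposes the class in the installed avro release. The sketch below is a diagnostic only, not a fix. The helper names `has_attribute` and `locate` are hypothetical, and `avro.errors` is listed purely as a guess at where the class might have moved.

```python
import importlib


def has_attribute(module_name, attr_name):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr_name)


# 'avro.schema' comes from the error message; 'avro.errors' is an assumption.
CANDIDATE_MODULES = ["avro.schema", "avro.errors"]


def locate(attr_name="AvroException", modules=CANDIDATE_MODULES):
    """Report, per candidate module, whether it exposes attr_name."""
    return {name: has_attribute(name, attr_name) for name in modules}


if __name__ == "__main__":
    for name, found in locate().items():
        status = {True: "present", False: "missing", None: "module not importable"}[found]
        print(f"{name}.AvroException: {status}")
```

If `avro.schema` reports "missing" while another module reports "present", that would point to a rename or move in the avro library rather than a problem with the JSON file.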
I would appreciate any insights or suggestions on how to resolve this issue. Let me know if you need any additional information or if there are any specific diagnostic steps I should take.