Hi <@U06TM7M40RK>. We’re running ingestion to datahub from Snowflake. If we’re running ingestion with CLI version: 0.12.1.5
everything runs fine. But since version 0.13.0 it is failing.
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO -
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - Cli report:
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - {'cli_version': '0.14.0.4',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'cli_entry_location': '/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/__init__.py',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'models_version': 'bundled',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'py_version': '3.12.5 (main, Aug 20 2024, 10:24:55) [GCC 8.5.0 20210514 (Red Hat 8.5.0-22)]',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'py_exec_path': '/data01/airflow/.pyenv/versions/py3_datahub/bin/python',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'os_details': 'Linux-4.18.0-553.8.1.el8_10.x86_64-x86_64-with-glibc2.28',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'mem_info': '1.27 GB',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'peak_memory_usage': '1.27 GB',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'disk_info': {'total': '12.87 GB', 'used': '4.22 GB', 'used_initally': '4.22 GB', 'free': '8.65 GB'},
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'peak_disk_usage': '4.22 GB',
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'thread_count': 3,
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - 'peak_thread_count': 18}
[2024-09-05, 12:32:45 CEST] {subprocess.py:93} INFO - Source (snowflake) report:
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - [2024-09-05 12:32:46,206] ERROR {datahub.entrypoints:218} - Command failed: too many SQL variables
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - Traceback (most recent call last):
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/entrypoints.py", line 205, in main
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - sys.exit(datahub(standalone_mode=False, **kwargs))
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return self.main(*args, **kwargs)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/click/core.py", line 1078, in main
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - rv = self.invoke(ctx)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return _process_result(sub_ctx.command.invoke(sub_ctx))
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return _process_result(sub_ctx.command.invoke(sub_ctx))
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return ctx.invoke(self.callback, **ctx.params)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/click/core.py", line 783, in invoke
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return __callback(*args, **kwargs)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/telemetry/telemetry.py", line 462, in wrapper
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - raise e
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/telemetry/telemetry.py", line 411, in wrapper
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - res = func(*args, **kwargs)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/cli/ingest_cli.py", line 203, in run
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ret = loop.run_until_complete(run_ingestion_and_check_upgrade())
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/3.12.5/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return future.result()
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/cli/ingest_cli.py", line 187, in run_ingestion_and_check_upgrade
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ret = await ingestion_future
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/cli/ingest_cli.py", line 145, in run_pipeline_to_completion
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ret = pipeline.pretty_print_summary(warnings_as_failure=strict_warnings)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/ingestion/run/pipeline.py", line 678, in pretty_print_summary
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - click.echo(self.source.get_report().as_string())
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/ingestion/api/report.py", line 83, in as_string
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return pprint.pformat(self.as_obj(), width=150, sort_dicts=False)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/ingestion/api/source.py", line 321, in as_obj
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - **super().as_obj(),
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/ingestion/api/report.py", line 76, in as_obj
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - str(key): Report.to_pure_python_obj(value)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/ingestion/api/report.py", line 42, in to_pure_python_obj
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return some_val.as_obj()
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/ingestion/api/report.py", line 74, in as_obj
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - self.compute_stats()
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/sql_parsing/sql_parsing_aggregator.py", line 247, in compute_stats
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - self.num_unique_query_fingerprints = len(self._aggregator._query_map)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/utilities/file_backed_collections.py", line 388, in __len__
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - cursor = self._conn.execute(
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - File "/data01/airflow/.pyenv/versions/py3_datahub/lib/python3.12/site-packages/datahub/utilities/file_backed_collections.py", line 117, in execute
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - return self.conn.execute(sql, parameters)
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2024-09-05, 12:32:46 CEST] {subprocess.py:93} INFO - sqlite3.OperationalError: too many SQL variables
[2024-09-05, 12:32:52 CEST] {subprocess.py:97} INFO - Command exited with return code 1```
What could be causing this?