Hello everyone,
I am building a POC with the Docker quickstart (version 0.12.1). I am trying to ingest from dbt Core (not dbt Cloud) and I get the following error:
```
[2023-12-14 16:17:23,930] ERROR {datahub.entrypoints:186} - Command failed: Failed to find a registered source for type dbt: dbt is disabled; try running: pip install 'acryl-datahub[dbt]'
Traceback (most recent call last):
  File "/tmp/datahub/ingest/venv-dbt-f3c1b67e57e34548/lib/python3.10/site-packages/datahub/ingestion/api/registry.py", line 126, in _ensure_not_lazy
    plugin_class = import_path(path)
  File "/tmp/datahub/ingest/venv-dbt-f3c1b67e57e34548/lib/python3.10/site-packages/datahub/ingestion/api/registry.py", line 56, in import_path
    item = importlib.import_module(module_name)
  File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/tmp/datahub/ingest/venv-dbt-f3c1b67e57e34548/lib/python3.10/site-packages/datahub/ingestion/source/dbt/dbt_core.py", line 12, in <module>
    from datahub.configuration.git import GitReference
  File "/tmp/datahub/ingest/venv-dbt-f3c1b67e57e34548/lib/python3.10/site-packages/datahub/configuration/git.py", line 9, in <module>
    from datahub.ingestion.source.git.git_import import GitClone
  File "/tmp/datahub/ingest/venv-dbt-f3c1b67e57e34548/lib/python3.10/site-packages/datahub/ingestion/source/git/git_import.py", line 8, in <module>
    import git
ModuleNotFoundError: No module named 'git'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tmp/datahub/ingest/venv-dbt-f3c1b67e57e34548/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 120, in _add_init_error_context
    yield
  File "/tmp/datahub/ingest/venv-dbt-f3c1b67e57e34548/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 223, in __init__
    source_class = source_registry.get(source_type)
  File "/tmp/datahub/ingest/venv-dbt-f3c1b67e57e34548/lib/python3.10/site-packages/datahub/ingestion/api/registry.py", line 176, in get
    raise ConfigurationError(
datahub.configuration.common.ConfigurationError: dbt is disabled; try running: pip install 'acryl-datahub[dbt]'
```
Do I need to install git in one of the images, or should I run `pip install 'acryl-datahub[dbt]'` as the stack trace suggests? If so, in which image should I run that command, or do I need a new Dockerfile for this? Thanks!
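
For reference, the root cause in the traceback is the failed `import git`, so one way to narrow this down might be to check which modules are importable from the Python interpreter of the ingestion venv. The snippet below is only a diagnostic sketch, not a fix: the helper name `missing_modules` is hypothetical, and it assumes the `git` import name is the one provided by the GitPython package.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "git" is the import that fails in the traceback (assumed to come from GitPython);
    # "datahub" is included to confirm the check runs inside the ingestion environment.
    missing = missing_modules(["git", "datahub"])
    if missing:
        for name in missing:
            print(f"module not importable: {name}")
    else:
        print("all checked modules are importable")
```

If `git` is reported as missing while `datahub` is found, that would be consistent with the `ModuleNotFoundError` above and would point at the Python environment rather than at a missing `git` binary in the image.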