Unable to ingest Tableau
[2024-07-12 10:29:12,124] INFO {datahub.ingestion.reporting.file_reporter:54} - Wrote UNKNOWN report successfully to <_io.TextIOWrapper name='/tmp/datahub/ingest/4555cc7d-6205-4645-9d42-022a4fa969b8/ingestion_report.json' mode='w' encoding='UTF-8'>
[2024-07-12 10:29:12,125] INFO {datahub.cli.ingest_cli:133} - Source (tableau) report:
{'events_produced': 0,
'events_produced_per_sec': 0,
'entities': {},
'aspects': {},
'aspect_urn_samples': {},
'warnings': {},
'failures': {},
'soft_deleted_stale_entities': ,
'start_time': '2024-07-12 10:29:03.632583 (8.49 seconds ago)',
'running_time': '8.49 seconds'}
[2024-07-12 10:29:12,126] INFO {datahub.cli.ingest_cli:136} - Sink (datahub-rest) report:
{'total_records_written': 0,
'records_written_per_second': 0,
'warnings': ,
'failures': ,
'start_time': '2024-07-12 10:29:03.497124 (8.63 seconds ago)',
'current_time': '2024-07-12 10:29:12.126143 (now)',
'total_duration_in_seconds': 8.63,
'max_threads': 15,
'gms_version': 'v0.13.3rc1',
'pending_requests': 0}
[2024-07-12 10:29:12,551] ERROR {datahub.entrypoints:205} - Command failed: '272322c6-0191-42da-90bd-04856b3bed5b'
Traceback (most recent call last):
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/entrypoints.py", line 192, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 454, in wrapper
raise e
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 403, in wrapper
res = func(*args, **kwargs)
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 201, in run
ret = loop.run_until_complete(run_ingestion_and_check_upgrade())
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 185, in run_ingestion_and_check_upgrade
ret = await ingestion_future
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 139, in run_pipeline_to_completion
raise e
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 131, in run_pipeline_to_completion
pipeline.run()
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 430, in run
for wu in itertools.islice(
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 162, in auto_stale_entity_removal
for wu in stream:
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 186, in auto_workunit_reporter
for wu in stream:
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 279, in auto_browse_path_v2
for urn, batch in _batch_workunits_by_urn(stream):
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 494, in _batch_workunits_by_urn
for wu in stream:
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 394, in auto_fix_duplicate_schema_field_paths
for wu in stream:
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 203, in auto_materialize_referenced_tags_terms
for wu in stream:
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 106, in auto_status_aspect
for wu in stream:
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2801, in get_workunits_internal
self._populate_projects_registry()
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 775, in _populate_projects_registry
all_project_map: Dict[str, TableauProject] = self._get_all_project()
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 678, in _get_all_project
set_project_path()
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 673, in set_project_path
project.path = form_path(project_id=project.id)
File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 668, in form_path
cur_proj = all_project_map[cur_proj.parent_id]
KeyError: '272322c6-0191-42da-90bd-04856b3bed5b'
Hey there! Make sure your message includes the following information if relevant, so we can help more effectively!
- Are you using UI or CLI for ingestion?
- Which DataHub version are you using? (e.g. 0.12.0)
- What data source(s) are you integrating with DataHub? (e.g. BigQuery)
<@U06TM7M40RK> Can you help with the questions in this thread?
We're experimenting with a new AI assistant for questions in the <#CUMUWQU66> channel to help you track down helpful resources more quickly. Keep in mind that responses may not always be spot-on; feel free to ask it follow-up questions, and don't forget to give its responses a thumbs-up or thumbs-down so other folks in the Community know if you're still looking for help on the topic!
To help you better, I need a bit more information:
- Are you using the UI or CLI for ingestion?
- Which DataHub version are you using?
- What data source(s) are you integrating with DataHub (e.g., Tableau, BigQuery)?
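From the error logs you provided, the ingestion fails with a KeyError for project ID 272322c6-0191-42da-90bd-04856b3bed5b. The last frame of the traceback shows the Tableau source looking up a project's parent_id in its project map while building project paths, and that parent ID is not in the map. This suggests the parent project was not among the projects that were fetched, or that the project hierarchy was not processed as expected.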
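To make that failure mode concrete, here is a minimal sketch of the lookup pattern the traceback points at. It is not DataHub's actual code: `Project`, `form_path`, and `form_path_tolerant` are hypothetical stand-ins, and the scenario (a child project whose parent was never returned) is an assumption about what happened in this run.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Project:
    # Hypothetical stand-in for the connector's project record.
    id: str
    name: str
    parent_id: Optional[str] = None


def form_path(project_id: str, projects: Dict[str, Project]) -> List[str]:
    """Walk parent links up to the root; raises KeyError on a missing parent."""
    cur = projects[project_id]
    path = [cur.name]
    while cur.parent_id is not None:
        cur = projects[cur.parent_id]  # fails if the parent was never fetched
        path.append(cur.name)
    return list(reversed(path))


def form_path_tolerant(project_id: str, projects: Dict[str, Project]) -> List[str]:
    """Same walk, but stop at the first parent that is not in the map."""
    cur = projects[project_id]
    path = [cur.name]
    while cur.parent_id is not None and cur.parent_id in projects:
        cur = projects[cur.parent_id]
        path.append(cur.name)
    return list(reversed(path))


# Assumed scenario: "child" points at a parent that is absent from the map.
projects = {
    "root": Project(id="root", name="Default"),
    "mid": Project(id="mid", name="Finance", parent_id="root"),
    "child": Project(id="child", name="Reports", parent_id="missing-parent-id"),
}
```

Calling `form_path("child", projects)` raises `KeyError: 'missing-parent-id'`, which has the same shape as the error in the log, while `form_path_tolerant` returns a truncated path instead of failing.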
Based on the data sources, there are a few known issues and potential solutions related to Tableau ingestion:
- ProjectLuid Field Issue: If you're using a Tableau version earlier than 2022.3, there might be issues with the projectLuid field. This was a breaking change introduced in later versions of Tableau (source: https://github.com/datahub-project/datahub/issues/7575).
- Permissions Issue: Ensure that the user account used for ingestion has the necessary permissions, such as Site Administrator Explorer (source: https://datahubspace.slack.com/archives/C029A3M079U/p1682430052.651439).
- Elasticsearch Limitation: If you're hitting a limit of 10,000 objects, it might be due to Elasticsearch's default settings (source: https://datahubspace.slack.com/archives/C029A3M079U/p1712327114.398919).
- Subprojects Handling: Issues with nested subprojects can also cause ingestion failures (source: https://github.com/datahub-project/datahub/issues/9398).
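For the permissions and subproject cases, one way to narrow things down is to list the projects the ingestion account can see and flag any whose parent ID is not in that list. The sketch below assumes you have already exported those projects as simple rows; `ProjectRow`, `find_orphans`, and the sample IDs are hypothetical, and how the rows are fetched (REST API, client library, or an export) is left out.

```python
from typing import Iterable, List, NamedTuple, Optional


class ProjectRow(NamedTuple):
    # Hypothetical row shape: one entry per project visible to the account.
    id: str
    name: str
    parent_id: Optional[str]


def find_orphans(rows: Iterable[ProjectRow]) -> List[ProjectRow]:
    """Return projects whose parent_id is set but not among the visible IDs."""
    materialized = list(rows)
    known_ids = {row.id for row in materialized}
    return [
        row
        for row in materialized
        if row.parent_id is not None and row.parent_id not in known_ids
    ]


# Illustrative data only; these IDs are placeholders, not values from the log.
rows = [
    ProjectRow(id="p-1", name="Default", parent_id=None),
    ProjectRow(id="p-2", name="Finance", parent_id="p-1"),
    ProjectRow(id="p-3", name="Reports", parent_id="p-hidden"),
]

for orphan in find_orphans(rows):
    print(f"{orphan.name} ({orphan.id}) references missing parent {orphan.parent_id}")
```

If the ID from the KeyError shows up as a missing parent here, the parent project is probably not visible to the ingestion account, which would point toward the permissions or subproject explanations rather than the other two.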
Would you like more detailed steps to troubleshoot this issue, or do you have specific aspects you'd like to focus on?
Sources:
- GitHub issue (datahub-project/datahub): change in tableau ingestion
- GitHub issue (datahub-project/datahub): ingestion error: Cannot return null for non-nullable type
- Slack: ingestion problem
- Slack: data ingestion error
Hint: Mention @runllm in the thread for followups.
- Ingestion: UI
- DataHub version: v0.13.3rc1
- Data source: Tableau
- Tableau version: 2023.3.2 (20233.24.0112.1233)