Issue with Tableau Ingestion KeyError: '272322c6-0191-42da-90bd-04856b3bed5b'

Original Slack Thread

:thread: Unable to ingest Tableau

[2024-07-12 10:29:12,124] INFO {datahub.ingestion.reporting.file_reporter:54} - Wrote UNKNOWN report successfully to <_io.TextIOWrapper name='/tmp/datahub/ingest/4555cc7d-6205-4645-9d42-022a4fa969b8/ingestion_report.json' mode='w' encoding='UTF-8'>
[2024-07-12 10:29:12,125] INFO {datahub.cli.ingest_cli:133} - Source (tableau) report:
{'events_produced': 0,
'events_produced_per_sec': 0,
'entities': {},
'aspects': {},
'aspect_urn_samples': {},
'warnings': {},
'failures': {},
'soft_deleted_stale_entities': ,
'start_time': '2024-07-12 10:29:03.632583 (8.49 seconds ago)',
'running_time': '8.49 seconds'}
[2024-07-12 10:29:12,126] INFO {datahub.cli.ingest_cli:136} - Sink (datahub-rest) report:
{'total_records_written': 0,
'records_written_per_second': 0,
'warnings': ,
'failures': ,
'start_time': '2024-07-12 10:29:03.497124 (8.63 seconds ago)',
'current_time': '2024-07-12 10:29:12.126143 (now)',
'total_duration_in_seconds': 8.63,
'max_threads': 15,
'gms_version': 'v0.13.3rc1',
'pending_requests': 0}
[2024-07-12 10:29:12,551] ERROR {datahub.entrypoints:205} - Command failed: '272322c6-0191-42da-90bd-04856b3bed5b'
Traceback (most recent call last):
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/entrypoints.py", line 192, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 454, in wrapper
    raise e
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 403, in wrapper
    res = func(*args, **kwargs)
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 201, in run
    ret = loop.run_until_complete(run_ingestion_and_check_upgrade())
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 185, in run_ingestion_and_check_upgrade
    ret = await ingestion_future
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 139, in run_pipeline_to_completion
    raise e
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 131, in run_pipeline_to_completion
    pipeline.run()
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 430, in run
    for wu in itertools.islice(
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 162, in auto_stale_entity_removal
    for wu in stream:
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 186, in auto_workunit_reporter
    for wu in stream:
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 279, in auto_browse_path_v2
    for urn, batch in _batch_workunits_by_urn(stream):
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 494, in _batch_workunits_by_urn
    for wu in stream:
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 394, in auto_fix_duplicate_schema_field_paths
    for wu in stream:
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 203, in auto_materialize_referenced_tags_terms
    for wu in stream:
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 106, in auto_status_aspect
    for wu in stream:
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2801, in get_workunits_internal
    self._populate_projects_registry()
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 775, in _populate_projects_registry
    all_project_map: Dict[str, TableauProject] = self._get_all_project()
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 678, in _get_all_project
    set_project_path()
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 673, in set_project_path
    project.path = form_path(project_id=project.id)
  File "/tmp/datahub/ingest/venv-tableau-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 668, in form_path
    cur_proj = all_project_map[cur_proj.parent_id]
KeyError: '272322c6-0191-42da-90bd-04856b3bed5b'

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Are you using UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g. 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g. BigQuery)

<@U06TM7M40RK> Can you help with the questions in this thread?


To help you better, I need a bit more information:

  1. Are you using the UI or CLI for ingestion?
  2. Which DataHub version are you using?
  3. What data source(s) are you integrating with DataHub (e.g., Tableau, BigQuery)?

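From the error logs you provided, the run fails with a KeyError in form_path (tableau.py, line 668). While building a project's path, the connector looks up cur_proj.parent_id in all_project_map, and the parent ID 272322c6-0191-42da-90bd-04856b3bed5b is not in that map. In other words, some project references a parent project that was not among the projects fetched. This might be due to an issue with the way projects are being fetched or processed.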
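To make the failure mode concrete, here is a minimal sketch of how a parent lookup like the one in the traceback can raise, and how a tolerant walk could avoid it. This is not DataHub's code: the Project class, the sample project names, and this version of form_path are hypothetical stand-ins, and only the parent ID is taken from the log.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Project:
    # Hypothetical stand-in for the connector's project record.
    id: str
    name: str
    parent_id: Optional[str] = None


def form_path(project_id: str, projects: Dict[str, Project]) -> List[str]:
    """Walk parent links upward, stopping when a parent is not in the map."""
    path: List[str] = []
    seen = set()  # guards against accidental cycles in parent links
    cur: Optional[Project] = projects.get(project_id)
    while cur is not None and cur.id not in seen:
        seen.add(cur.id)
        path.append(cur.name)
        if cur.parent_id is None:
            break
        # .get() yields None for an unknown parent instead of raising KeyError.
        cur = projects.get(cur.parent_id)
    path.reverse()
    return path


# Illustrative data: "child" points at a parent that was never fetched.
projects = {
    "child": Project("child", "Sales", parent_id="272322c6-0191-42da-90bd-04856b3bed5b"),
    "root": Project("root", "Finance"),
    "leaf": Project("leaf", "Reports", parent_id="root"),
}

print(form_path("child", projects))  # ['Sales']
print(form_path("leaf", projects))   # ['Finance', 'Reports']
```

Indexing the map directly with the missing ID (projects["272322c6-0191-42da-90bd-04856b3bed5b"]) raises the same KeyError seen in the log, which is why the sketch truncates the path at the last known project instead.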

Based on the available sources, there are a few known issues and potential solutions related to Tableau ingestion:

  1. ProjectLuid Field Issue: If you're using a Tableau version earlier than 2022.3, there might be issues with the projectLuid field. This was a breaking change introduced in later versions of Tableau (source: https://github.com/datahub-project/datahub/issues/7575).
  2. Permissions Issue: Ensure that the user account used for ingestion has the necessary permissions, such as Site Administrator Explorer (source: https://datahubspace.slack.com/archives/C029A3M079U/p1682430052.651439).
  3. Elasticsearch Limitation: If you're hitting a limit of 10,000 objects, it might be due to Elasticsearch's default settings (source: https://datahubspace.slack.com/archives/C029A3M079U/p1712327114.398919).
  4. Subprojects Handling: Issues with nested subprojects can also cause ingestion failures (source: https://github.com/datahub-project/datahub/issues/9398).
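For the permissions and subproject cases, one way to narrow things down is to check whether any project visible to the ingestion account points at a parent that the same account cannot see. The sketch below assumes you already have (project ID, parent ID) pairs from some project listing. The function name find_missing_parents and the sample IDs "p1" and "p2" are hypothetical, and only the parent ID comes from the log.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def find_missing_parents(
    projects: Iterable[Tuple[str, Optional[str]]],
) -> Dict[str, List[str]]:
    """Map each parent ID absent from the listing to the child IDs that reference it."""
    pairs = list(projects)
    known_ids = {project_id for project_id, _ in pairs}
    missing: Dict[str, List[str]] = {}
    for project_id, parent_id in pairs:
        if parent_id is not None and parent_id not in known_ids:
            missing.setdefault(parent_id, []).append(project_id)
    return missing


# Illustrative listing: (project_id, parent_id) pairs as seen by the ingestion account.
listing = [
    ("p1", None),
    ("p2", "272322c6-0191-42da-90bd-04856b3bed5b"),
]

print(find_missing_parents(listing))
```

A non-empty result would suggest that the account can see a child project but not its parent, which fits the permissions and nested-subproject explanations above.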

Would you like more detailed steps to troubleshoot this issue, or do you have specific aspects you’d like to focus on?

  1. UI
  2. DataHub version: v0.13.3rc1
  3. Tableau

Tableau version: 2023.3.2 (20233.24.0112.1233)