Error Fix for Datahub v0.12.1 Ingestion Failure with Fivetran Metadata

Original Slack Thread

Hi, I am using DataHub v0.12.1 and trying to ingest metadata from Fivetran using the log tables in Snowflake. However, the ingestion fails with the following error:

```
  File "/tmp/datahub/ingest/venv-fivetran-", line 173, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/tmp/datahub/ingest/venv-fivetran-", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/tmp/datahub/ingest/venv-fivetran-", line 1078, in main
    rv = self.invoke(ctx)
  File "/tmp/datahub/ingest/venv-fivetran-", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/tmp/datahub/ingest/venv-fivetran-", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/tmp/datahub/ingest/venv-fivetran-", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/tmp/datahub/ingest/venv-fivetran-", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/tmp/datahub/ingest/venv-fivetran-", line 448, in wrapper
    raise e
  File "/tmp/datahub/ingest/venv-fivetran-", line 397, in wrapper
    res = func(*args, **kwargs)
  File "/tmp/datahub/ingest/venv-fivetran-", line 206, in run
    ret = loop.run_until_complete(run_ingestion_and_check_upgrade())
  File "/usr/local/lib/python3.10/asyncio/", line 649, in run_until_complete
    return future.result()
  File "/tmp/datahub/ingest/venv-fivetran-", line 190, in run_ingestion_and_check_upgrade
    ret = await ingestion_future
  File "/tmp/datahub/ingest/venv-fivetran-", line 144, in run_pipeline_to_completion
    raise e
  File "/tmp/datahub/ingest/venv-fivetran-", line 136, in run_pipeline_to_completion
  File "/tmp/datahub/ingest/venv-fivetran-", line 381, in run
    for wu in itertools.islice(
  File "/tmp/datahub/ingest/venv-fivetran-", line 127, in auto_stale_entity_removal
    for wu in stream:
  File "/tmp/datahub/ingest/venv-fivetran-", line 151, in auto_workunit_reporter
    for wu in stream:
  File "/tmp/datahub/ingest/venv-fivetran-", line 207, in re_emit_browse_path_v2
    for wu in stream:
  File "/tmp/datahub/ingest/venv-fivetran-", line 248, in auto_browse_path_v2
    for urn, batch in _batch_workunits_by_urn(stream):
  File "/tmp/datahub/ingest/venv-fivetran-", line 386, in _batch_workunits_by_urn
    for wu in stream:
  File "/tmp/datahub/ingest/venv-fivetran-", line 164, in auto_materialize_referenced_tags
    for wu in stream:
  File "/tmp/datahub/ingest/venv-fivetran-", line 71, in auto_status_aspect
    for wu in stream:
  File "/tmp/datahub/ingest/venv-fivetran-", line 280, in get_workunits_internal
    connectors = self.audit_log.get_connectors_list()
  File "/tmp/datahub/ingest/venv-fivetran-", line 138, in get_connectors_list
  File "/tmp/datahub/ingest/venv-fivetran-", line 121, in _get_user_name
    user_details = self._query(FivetranLogQuery.get_user_query(user_id=user_id))[0]
IndexError: list index out of range
```

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Are you using UI or CLI for ingestion?
  2. Which DataHub version are you using? (e.g. 0.12.0)
  3. What data source(s) are you integrating with DataHub? (e.g. BigQuery)

Looking at the query logs, I see this as the last query in Snowflake before the connector errors out:

```
        given_name as "GIVEN_NAME",
        family_name as "FAMILY_NAME"
        FROM USER
        WHERE id = 'None'
```
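A likely explanation, judging from that query log: the connector has no associated user, so a Python `None` user id is interpolated into the SQL as the literal string `'None'`, which matches no rows, and indexing the empty result with `[0]` raises the `IndexError`. A minimal sketch of that failure mode (the query builder here is illustrative, not the actual DataHub code):

```python
# Sketch: how a missing user id becomes the literal string 'None' in SQL.
user_id = None  # e.g. the Fivetran connector has no associated user

query = f"""
SELECT
    given_name as "GIVEN_NAME",
    family_name as "FAMILY_NAME"
FROM USER
WHERE id = '{user_id}'
"""

print(query)  # the WHERE clause reads: WHERE id = 'None'

# Snowflake finds no row whose id is the string 'None', so the caller's
# result[0] raises IndexError: list index out of range.
```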

I am using the UI for ingestion

Thanks, I think I see what the issue is and will provide a fix soon.
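For anyone hitting this before the fix lands, the guard presumably needs to cover both the missing user id and the empty result set. A hypothetical sketch under those assumptions (`run_query` and the names below are illustrative stand-ins, not the actual DataHub source):

```python
from typing import Callable, List, Optional, Tuple

def get_user_name(
    run_query: Callable[[str], List[Tuple[str, str]]],
    user_id: Optional[str],
) -> Optional[str]:
    """Resolve a Fivetran user id to a display name, tolerating missing data.

    `run_query` stands in for the connector's Snowflake query helper.
    """
    if not user_id:  # connector has no associated user: skip the lookup
        return None
    rows = run_query(
        f"SELECT given_name, family_name FROM USER WHERE id = '{user_id}'"
    )
    if not rows:  # no matching user: avoid rows[0] -> IndexError
        return None
    given_name, family_name = rows[0]
    return f"{given_name} {family_name}"
```

With guards like these, an unknown or missing user id yields `None` instead of crashing the whole ingestion run.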

Hi, I think we’re running into this issue as well. Has the fix been implemented?

Yes, I think it was merged, and it should be fixed with the latest CLI.