Troubleshooting DataHub ingestion issues and solutions

Original Slack Thread

Hi everyone, I’m trying out DataHub for the first time and going through the quickstart installation. I was able to get the containers running, including the frontend in the browser. However, when I try to run
```datahub docker ingest-sample-data```
I get this message:
```
Downloading sample data...
Docker is not ready:
- datahub-actions is not running

Try running `datahub docker quickstart` first.
```
I installed DataHub using pip in a fresh Python 3.8.9 virtual environment. Afterwards, I tried running
```pip install --upgrade acryl-datahub-actions```
but still got the same error. Then I looked into the logs of the datahub-actions container in Docker and found this:
```ImportError: cannot import name 'resolve_element' from 'datahub.configuration.config_loader' (/usr/local/lib/python3.10/site-packages/datahub/configuration/config_loader.py)```
Appreciate any help here. Thanks!
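
The traceback path (`/usr/local/lib/python3.10/...`) belongs to the Python inside the datahub-actions container, so upgrading packages in the host virtual environment (Python 3.8.9) would not change what the container imports. An `ImportError` like this one is consistent with a version mismatch between the code doing the import and the `datahub` package installed next to it. Below is a minimal sketch for checking whether a name is available from a module in a given environment. The helper name `can_import_name` is hypothetical; the module and attribute in the usage lines are taken from the traceback.

```python
import importlib


def can_import_name(module_name: str, attr: str) -> bool:
    """Return True if `attr` is available on `module_name` in this environment."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing (or fails to import) here.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Module and name copied from the traceback. To be meaningful, this has
    # to run inside the datahub-actions container, not on the host.
    print(can_import_name("datahub.configuration.config_loader", "resolve_element"))
```

If this prints `False` inside the container, the installed `datahub` package does not provide the name that the actions code expects.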

I see the same error. Initially, I ran the quickstart using 0.11.0 last Friday with no issues. After removing my containers and images, reinstalling acryl-datahub, and rerunning the quickstart this morning, I get this same error. I am using Python 3.9.4.

I have the same error. Two days ago it was still working.

Same issue for me. Using Python 3.9.4, Docker Desktop for Mac 4.23.0, and Compose v2.21.0-desktop.1.

I solved it by pinning the version in my compose file. I changed this line, and it works for me now:

```
    image: ${DATAHUB_ACTIONS_IMAGE:-acryldata/datahub-actions}:v0.0.13
```
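
To spot other images in a compose file that still float on a moving tag, something like the following rough sketch could help. It is not part of DataHub: the function names, the `example/service` image, and the `EXAMPLE_VERSION` variable are hypothetical, and treating only `head` and `latest` as floating tags is an assumption. It reads `image:` lines as plain text rather than parsing YAML.

```python
import re
from typing import List, Optional

# Tags that move over time; anything else is treated as pinned (an assumption).
FLOATING_TAGS = {"head", "latest"}
# Matches a compose-style default expression such as ${EXAMPLE_VERSION:-head}.
DEFAULT_EXPR = re.compile(r"^\$\{[^:}]+:-([^}]*)\}$")


def image_tag(ref: str) -> Optional[str]:
    """Return the tag of an image reference, or None if it has no tag."""
    depth = 0
    last_colon = -1
    for i, ch in enumerate(ref):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == ":" and depth == 0:
            last_colon = i  # colons inside ${...} are ignored
    if last_colon == -1:
        return None
    tag = ref[last_colon + 1:]
    # A "/" after the colon means it was a registry host:port, not a tag.
    return None if "/" in tag else tag


def is_pinned(ref: str) -> bool:
    """Return True if the reference carries a tag that is not a floating one."""
    tag = image_tag(ref)
    if tag is None:
        return False  # untagged images resolve to "latest"
    match = DEFAULT_EXPR.match(tag)
    if match:
        tag = match.group(1)  # assumes the variable is unset, so the default applies
    return bool(tag) and tag not in FLOATING_TAGS


def unpinned_images(compose_text: str) -> List[str]:
    """Collect image references from `image:` lines that are not pinned."""
    refs = []
    for line in compose_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("image:"):
            ref = stripped[len("image:"):].strip().strip("'\"")
            if not is_pinned(ref):
                refs.append(ref)
    return refs


SAMPLE = """
services:
  actions:
    image: ${DATAHUB_ACTIONS_IMAGE:-acryldata/datahub-actions}:v0.0.13
  other:
    image: example/service:${EXAMPLE_VERSION:-head}
"""

if __name__ == "__main__":
    print(unpinned_images(SAMPLE))
```

With the sample above, only the hypothetical `example/service` line is reported, because the pinned line from this thread carries the fixed tag `v0.0.13`.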

Thanks <@U05SYDY5D4K>, it solved my issue as well; now all containers are healthy. However, even with a successful ingestion from the CLI, I can’t see anything created in the UI.
I’ve run `datahub docker ingest-sample-data` (output is below), but I’ve got an empty UI. I also tried to ingest some JSON schemas, with the same outcome. Are you able to see any data at all on your end?

```
 'events_produced_per_sec': 22,
 'entities': {'corpuser': ['urn:li:corpuser:datahub', 'urn:li:corpuser:jdoe'],
              'corpGroup': ['urn:li:corpGroup:jdoe', 'urn:li:corpGroup:bfoo'],
              'dataset': ['urn:li:dataset:(urn:li:dataPlatform:kafka,SampleKafkaDataset,PROD)',
                          'urn:li:dataset:(urn:li:dataPlatform:hdfs,SampleHdfsDataset,PROD)',
                          'urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)',
                          'urn:li:dataset:(urn:li:dataPlatform:hive,logging_events,PROD)',
                          'urn:li:dataset:(urn:li:dataPlatform:hive,fct_users_created,PROD)',
                          'urn:li:dataset:(urn:li:dataPlatform:hive,fct_users_deleted,PROD)',
                          'urn:li:dataset:(urn:li:dataPlatform:s3,project/root/events/logging_events_bckp,PROD)'],
              'dataJob': ['urn:li:dataJob:(urn:li:dataFlow:(airflow,dag_abc,PROD),task_123)',
                          'urn:li:dataJob:(urn:li:dataFlow:(airflow,dag_abc,PROD),task_456)'],
              'dataFlow': ['urn:li:dataFlow:(airflow,dag_abc,PROD)'],
              'chart': ['urn:li:chart:(looker,baz1)', 'urn:li:chart:(looker,baz2)'],
              'dashboard': ['urn:li:dashboard:(looker,baz)'],
              'mlModel': ['urn:li:mlModel:(urn:li:dataPlatform:science,scienceModel,PROD)'],
              'tag': ['urn:li:tag:Legacy', 'urn:li:tag:NeedsDocumentation'],
              'dataPlatform': ['urn:li:dataPlatform:couchbase',
                               'urn:li:dataPlatform:mongodb',
                               'urn:li:dataPlatform:pinot',
                               'urn:li:dataPlatform:presto',
                               'urn:li:dataPlatform:snowflake',
                               'urn:li:dataPlatform:redshift',
                               'urn:li:dataPlatform:mssql',
                               'urn:li:dataPlatform:druid',
                               'urn:li:dataPlatform:looker',
                               'urn:li:dataPlatform:sagemaker',
                               '... sampled of 27 total elements'],
              'mlPrimaryKey': ['urn:li:mlPrimaryKey:(test_feature_table_all_feature_dtypes,dummy_entity_1)',
                               'urn:li:mlPrimaryKey:(test_feature_table_all_feature_dtypes,dummy_entity_2)',
                               'urn:li:mlPrimaryKey:(test_feature_table_no_labels,dummy_entity_2)',
                               'urn:li:mlPrimaryKey:(test_feature_table_single_feature,dummy_entity_1)',
                               'urn:li:mlPrimaryKey:(user_features,user_name)',
                               'urn:li:mlPrimaryKey:(user_features,user_id)',
                               'urn:li:mlPrimaryKey:(user_analytics,user_name)'],
              'mlFeature': ['urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_BOOL_feature)',
                            'urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_BYTES_LIST_feature)',
                            'urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_DOUBLE_LIST_feature)',
                            'urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_DOUBLE_feature)',
                            'urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_FLOAT_LIST_feature)',
                            'urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_INT32_LIST_feature)',
                            'urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_INT64_LIST_feature)',
                            'urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_STRING_LIST_feature)',
                            'urn:li:mlFeature:(test_feature_table_all_feature_dtypes,test_STRING_feature)',
                            'urn:li:mlFeature:(user_features,number_of_visits)',
                            '... sampled of 20 total elements'],
              'mlFeatureTable': ['urn:li:mlFeatureTable:(urn:li:dataPlatform:feast,test_feature_table_all_feature_dtypes)',
                                 'urn:li:mlFeatureTable:(urn:li:dataPlatform:feast,test_feature_table_no_labels)',
                                 'urn:li:mlFeatureTable:(urn:li:dataPlatform:feast,test_feature_table_single_feature)',
                                 'urn:li:mlFeatureTable:(urn:li:dataPlatform:feast,user_features)',
                                 'urn:li:mlFeatureTable:(urn:li:dataPlatform:feast,user_analytics)'],
              'glossaryTerm': ['urn:li:glossaryTerm:CustomerAccount', 'urn:li:glossaryTerm:SavingAccount', 'urn:li:glossaryTerm:AccountBalance'],
              'glossaryNode': ['urn:li:glossaryNode:ClientsAndAccounts'],
              'container': ['urn:li:container:DATABASE', 'urn:li:container:SCHEMA'],
              'assertion': ['urn:li:assertion:358c683782c93c2fc2bd4bdd4fdb0153'],
              'query': ['urn:li:query:test-query']},
 'aspects': {'corpuser': {'corpUserInfo': 2, 'corpUserStatus': 1, 'status': 2},
             'corpGroup': {'corpGroupInfo': 2, 'status': 2},
             'dataset': {'browsePaths': 2,
                         'datasetProperties': 5,
                         'ownership': 7,
                         'institutionalMemory': 6,
                         'schemaMetadata': 7,
                         'status': 7,
                         'upstreamLineage': 6,
                         'editableSchemaMetadata': 1,
                         'globalTags': 1,
                         'datasetProfile': 2,
                         'operation': 2,
                         'datasetUsageStatistics': 1,
                         'container': 1},
             'dataJob': {'status': 2, 'ownership': 2, 'dataJobInfo': 2, 'dataJobInputOutput': 2},
             'dataFlow': {'status': 1, 'ownership': 1, 'dataFlowInfo': 1},
             'chart': {'status': 2, 'chartInfo': 2, 'globalTags': 1},
             'dashboard': {'status': 1, 'ownership': 1, 'dashboardInfo': 1},
             'mlModel': {'ownership': 1,
                         'mlModelProperties': 1,
                         'mlModelTrainingData': 1,
                         'mlModelEvaluationData': 1,
                         'institutionalMemory': 1,
                         'intendedUse': 1,
                         'mlModelMetrics': 1,
                         'mlModelEthicalConsiderations': 1,
                         'mlModelCaveatsAndRecommendations': 1,
                         'status': 1,
                         'cost': 1},
             'tag': {'status': 2, 'tagProperties': 2, 'ownership': 2},
             'dataPlatform': {'dataPlatformInfo': 27},
             'mlPrimaryKey': {'status': 7, 'mlPrimaryKeyProperties': 7},
             'mlFeature': {'status': 20, 'mlFeatureProperties': 20},
             'mlFeatureTable': {'status': 5, 'browsePaths': 5, 'mlFeatureTableProperties': 5},
             'glossaryTerm': {'status': 3, 'glossaryTermInfo': 3, 'ownership': 3},
             'glossaryNode': {'glossaryNodeInfo': 1, 'ownership': 1, 'status': 1},
             'container': {'containerProperties': 2, 'subTypes': 2, 'dataPlatformInstance': 2, 'container': 1},
             'assertion': {'assertionInfo': 1, 'dataPlatformInstance': 1, 'assertionRunEvent': 1},
             'query': {'queryProperties': 1, 'querySubjects': 1}},
 'warnings': {},
 'failures': {},
 'total_num_files': 1,
 'num_files_completed': 1,
 'files_completed': ['/var/folders/t9/8_8l91116gl919_sgsdx63qm0000gn/T/tmp_zre41ps.json'],
 'percentage_completion': '0%',
 'estimated_time_to_completion_in_minutes': -1,
 'total_bytes_read_completed_files': 120035,
 'current_file_size': 120035,
 'total_parse_time_in_seconds': 0.0,
 'total_count_time_in_seconds': 0.0,
 'total_deserialize_time_in_seconds': 0.0,
 'start_time': '2023-09-20 14:12:43.377711 (4.48 seconds ago)',
 'running_time': '4.47 seconds'}
Sink (datahub-rest) report:
{'total_records_written': 101,
 'records_written_per_second': 22,
 'warnings': [],
 'failures': [],
 'start_time': '2023-09-20 14:12:43.349866 (4.51 seconds ago)',
 'current_time': '2023-09-20 14:12:47.861194 (now)',
 'total_duration_in_seconds': 4.51,
 'gms_version': 'null',
 'pending_requests': 0}
```
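
The sink report above is printed as a Python dict literal, so it can be checked programmatically instead of by eye. Here is a minimal sketch, assuming the report text is copied as-is from the CLI output; the helper name `summarize_sink_report` and the summary keys are hypothetical. As this thread shows, a clean report did not guarantee that the entities appeared in the UI, so this only confirms what the sink itself reported.

```python
import ast

# Sink report copied verbatim from the CLI output in this thread.
SINK_REPORT = """{'total_records_written': 101,
 'records_written_per_second': 22,
 'warnings': [],
 'failures': [],
 'start_time': '2023-09-20 14:12:43.349866 (4.51 seconds ago)',
 'current_time': '2023-09-20 14:12:47.861194 (now)',
 'total_duration_in_seconds': 4.51,
 'gms_version': 'null',
 'pending_requests': 0}"""


def summarize_sink_report(text: str) -> dict:
    """Parse a sink report dict literal and pull out the fields worth checking."""
    report = ast.literal_eval(text)  # safe parsing of a Python literal
    return {
        "records_written": report["total_records_written"],
        "clean": not report["warnings"] and not report["failures"],
        "pending_requests": report["pending_requests"],
        # The report in this thread shows the string 'null' here.
        "gms_version_reported": report["gms_version"] not in (None, "", "null"),
    }


if __name__ == "__main__":
    print(summarize_sink_report(SINK_REPORT))
```

For this report the summary shows 101 records written with no warnings or failures, and no GMS version reported. Whether the `'null'` version is related to the empty UI is not established in this thread.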

I reran the quickstart just now (without any changes). The data was ingested successfully and now shows up in the UI.

<@U05RUR2CQA1> I guess it is because the normal quickstart without `--version` always pulls the latest `head`, so you pick up whatever change is currently behind the `head` tag on the Docker image. That is also why it breaks sometimes, depending on when each image is pushed to the registry tagged with `head`.
I pinned my version so I have a reproducible setup.
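
For the CLI route, the same idea can be sketched as a small wrapper that always passes an explicit `--version` to the quickstart. The function name `quickstart_command` is hypothetical, and the `v0.11.0` value is an assumption based on the 0.11.0 release mentioned earlier in this thread (the `v` prefix in particular is assumed); check the CLI help for the exact format.

```python
import shlex
from typing import List, Optional


def quickstart_command(version: Optional[str] = None) -> List[str]:
    """Build the quickstart command, pinning the version when one is given."""
    cmd = ["datahub", "docker", "quickstart"]
    if version:
        # Without --version, the quickstart follows the moving `head` images.
        cmd += ["--version", version]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(quickstart_command("v0.11.0")))
```

The resulting list can be passed to `subprocess.run(..., check=True)` once the version string has been confirmed.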

You are right. It is recommended to pin the version for production use. By the way, I have a scheduled ingestion task that doesn’t work. It worked well last week. Do you have the same problem?

I have not yet set up any data at all. I only did my setup to get my policies from Keycloak OIDC.