Troubleshooting Athena AccessDeniedException in DataHub Integration

Original Slack Thread

I’m getting an error when trying to load Athena, what’s going on?

pyathena.error.OperationalError: An error occurred (AccessDeniedException) when calling the ListDatabases operation: You are not authorized to perform: athena:ListDatabases on the resource. After your AWS administrator or you have updated your permissions, please try again.

To better assist you, could you please provide the following information:

  1. Are you using the UI or CLI for ingestion?
  2. Which DataHub version are you using?
  3. What data source(s) are you integrating with DataHub (e.g., BigQuery, Athena)?

The error is an AccessDeniedException raised by the ListDatabases operation, which indicates that the AWS credentials used for ingestion are not allowed to perform the athena:ListDatabases action.
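One way to confirm this outside of DataHub is to make the same API call directly. The sketch below is a minimal check, not part of DataHub or pyathena: `check_list_databases` is a hypothetical helper name, it assumes the client behaves like `boto3.client("athena")`, and it assumes the default catalog name `AwsDataCatalog`. It also assumes the client resolves the same credentials your ingestion run uses.

```python
def check_list_databases(client, catalog_name="AwsDataCatalog"):
    """Try one ListDatabases call and report whether it was permitted.

    `client` is expected to behave like boto3.client("athena").
    Returns (True, [database names]) on success, or
    (False, error message) when AWS answers with AccessDeniedException.
    """
    try:
        response = client.list_databases(CatalogName=catalog_name)
    except Exception as exc:
        # botocore's ClientError carries the AWS error details in `.response`.
        error = getattr(exc, "response", {}).get("Error", {})
        if error.get("Code") == "AccessDeniedException":
            return False, error.get("Message", "access denied")
        raise  # anything else is a different problem; let it surface
    names = [db["Name"] for db in response.get("DatabaseList", [])]
    return True, names


# Usage (requires boto3 and AWS credentials in the environment):
#   import boto3
#   ok, detail = check_list_databases(boto3.client("athena"))
#   print(ok, detail)
```

If this returns `False`, the problem is in the IAM permissions rather than in the DataHub recipe.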

According to the <https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/docs/sources/athena/athena_pre.md|Athena prerequisites documentation>, you need to ensure that your AWS role or credentials have the following permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "athena:GetTableMetadata", 
        "athena:StartQueryExecution", 
        "athena:GetQueryResults", 
        "athena:GetDatabase", 
        "athena:ListDataCatalogs",
        "athena:GetDataCatalog", 
        "athena:ListQueryExecutions", 
        "athena:GetWorkGroup", 
        "athena:StopQueryExecution", 
        "athena:GetQueryResultsStream", 
        "athena:ListDatabases", 
        "athena:GetQueryExecution", 
        "athena:ListTableMetadata", 
        "athena:BatchGetQueryExecution", 
        "glue:GetTables", 
        "glue:GetDatabases", 
        "glue:GetTable",
        "glue:GetDatabase",
        "glue:SearchTables",
        "glue:GetTableVersions",
        "glue:GetTableVersion",
        "glue:GetPartition", 
        "glue:GetPartitions", 
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:athena:${region-id}:${account-id}:datacatalog/*",
        "arn:aws:athena:${region-id}:${account-id}:workgroup/*",
        "arn:aws:glue:${region-id}:${account-id}:tableVersion/*/*/*",
        "arn:aws:glue:${region-id}:${account-id}:table/*/*", 
        "arn:aws:glue:${region-id}:${account-id}:catalog", 
        "arn:aws:glue:${region-id}:${account-id}:database/*", 
        "arn:aws:s3:::${datasets-bucket}",
        "arn:aws:s3:::${datasets-bucket}/*"
      ]
    }
  ]
}
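To compare a policy you already have against the list above, a rough check like the following may help. This is a simplified sketch under stated assumptions: `allowed_patterns` and `missing_actions` are hypothetical helpers, and they only look at `Allow` statements and `Action` wildcards. They ignore `Deny` statements, `NotAction`, `Resource` scoping, and conditions, so a clean result here does not guarantee that AWS will authorize the call.

```python
from fnmatch import fnmatchcase


def allowed_patterns(policy):
    """Collect lower-cased Action patterns from every Allow statement."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        patterns.extend(action.lower() for action in actions)
    return patterns


def missing_actions(policy, required):
    """Return the required actions that no Allow pattern matches."""
    patterns = allowed_patterns(policy)
    # IAM action names are case-insensitive, so compare in lower case.
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), p) for p in patterns)
    ]


# Illustrative policy that grants athena:Get* but not athena:ListDatabases.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["athena:Get*", "glue:GetTables"],
            "Resource": "*",
        }
    ],
}
required = ["athena:ListDatabases", "athena:GetDatabase", "glue:GetTables"]
print(missing_actions(policy, required))  # ['athena:ListDatabases']
```

In practice you would load your real policy with `json.load` and pass the full action list from the policy above as `required`.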

Would you like more details on how to configure these permissions or any other assistance?
