Discussion on Handling 'stl_insert' Permission Denied Error in Redshift Serverless

Original Slack Thread

Hi everyone! :hihi: I tried to find some solution in this channel, however it seems that on a newer version redshift-legacy module has been deprecated… Any way to workaround the permission denied error regarding stl_insert for Redshift Serverless please?

Hey there! :wave: Make sure your message includes the following information if relevant, so we can help more effectively!

  1. Which DataHub version are you using? (e.g. 0.12.0)
  2. Please post any relevant error logs on the thread!

I am creating a POC right now, and I am using version 0.12.0.1 where in redshift-legacy is still available… but we would like to use a newer version…

I think I’m facing same issue right now.
Some of our data source owners migrated from provisioned Redshift to Redshift Serverless and we are having issues with the redshift connector complaining about stl_insert.
AFAIK the problem is not about permissions but Redshift Serverless missing or having a different name for some of the tables being queried by the crawler (deprecated and non deprecated one), such as stl_insert, among others.
<@UV14447EU> do you know if there is any ongoing plan to support Redshift Serverless? (asking you as one of the contributors to the redshift connector)

I need to dig a bit deeper in this to understand how we can get audit history from Redshift spectrum. Unfortunately some of the information schema tables which we used don’t work with Serverless

Definitely we would like to support Serverless

It seems we should use SYS tables which can work on both. The only issue I know with sys tables is you can’t get the original query from it because they remove line breaks and a line break can confuse our query parser.
I hope they will fix it as they state that they keep the line breaks but we saw it differently

I will create a ticket about this

https://docs.aws.amazon.com/redshift/latest/dg/sys_view_migration.html

When you migrate your Amazon Redshift provisioned cluster to Amazon Redshift Serverless, your monitoring or diagnostic queries might reference system views that are only available on provisioned clusters. You can update your queries to use the SYS monitoring views.
yes, that’s Redshift suggestion also :+1:

Please, could you share ticket here so we can keep an eye on it? :slightly_smiling_face:

Thank you for the feedback! It would be easier for us to use the latest redshift connecter than using the legacy module. :pray:

Hey <@UV14447EU>, quick question - is adding Redshift Serverless support something your team is working on already? If not, my team could contribute with it, since we really need that capability soon. Just wanna check so we don’t step on each other’s toes!

Not working on it right now.

If you can work on it I’m happy to help and it can speed up to release it

That’s great! We’ll let you know once we start working on it! :+1: Thanks!

Sure, I’m happy to meet with you if you need help. A month ago I was working on another Redshift feature (improving the column/table level lineage), and there I was about to migrate to the new views, but then I gave up because:

  1. You can’t join an old information schema table with a new one with querying because even though the name is the same, the id is different in the new and the old one.
  2. The main issue was not being able to get the original SQL query with line breaks with the new information schema table -> even though STL_QUERYTEXT states that it stores it, in my experience, it didn’t contain it.

So new views does not fully replace old system tables in the case of provisioned Redshift? both schemas need to coexist?

yes, that was my main problem :disappointed:

And of course, they doesn’t replace exactly the old tables, but that can be fixed by tweaking the queries

that means we need to keep sort of two code paths (and/or full of “ifs”): one for provisioned one for serverless
what about having a separate connector? would that make future maintenance easier? :thinking_face:

What I would do is to double-check if STL_QUERYTEXT can return the original query with linebreaks and if we can get that then we can use the new views