Resolving SSL Certificate Error in Python Kafka Emitter for Schema Registry

Original Slack Thread

Hi team,
I am using the Kafka emitter. We have set up authentication for our Kafka client environment, and the (local) schema registry is pointing to https.
We wrote an action pipeline from which we pass our configuration.

This is my emitter (the config was shared as a screenshot):

For Kafka it is able to authenticate and connect to the client's Kafka cluster, but for the schema registry it shows an SSL certificate error. Please refer to the action config we have (also shared as a screenshot).
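(The original screenshots are not reproduced here. A rough sketch of an equivalent setup with the python Kafka emitter, with placeholder hosts, paths, and credentials standing in for the original values:)

```python
from datahub.emitter.kafka_emitter import DatahubKafkaEmitter, KafkaEmitterConfig

# Placeholder hosts, paths, and credentials -- not the original values.
emitter = DatahubKafkaEmitter(
    KafkaEmitterConfig.parse_obj(
        {
            "connection": {
                "bootstrap": "client-kafka:9093",
                "producer_config": {
                    # Basic auth (SASL/PLAIN over TLS) towards the client's Kafka
                    "security.protocol": "SASL_SSL",
                    "sasl.mechanism": "PLAIN",
                    "sasl.username": "svc-datahub",
                    "sasl.password": "********",
                    "ssl.ca.location": "/certs/ca-chain.pem",
                },
                # Internal (GMS-backed) schema registry, served over https
                "schema_registry_url": "https://datahub-gms:8080/schema-registry/api/",
                "schema_registry_config": {
                    # CA bundle the Schema Registry HTTP client should trust
                    "ssl.ca.location": "/certs/ca-chain.pem",
                },
            }
        }
    )
)
```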

Any update on this?

<@U03MF8MU5P0>

I am afraid I don’t fully understand the context here. Let me see if I can focus in on the problem; there are some configuration snippets. First, I am guessing you mean the python emitter, not a java one, based on the first screenshot, i.e. this doc on the python emitter: https://datahubproject.io/docs/metadata-ingestion/as-a-library#kafka-emitter. That emitter seems to be created using the other data in the second screenshot.
What is the error specifically? Is it an invalid cert or a protocol mismatch?
In the schema registry URL, did you remove the protocol? Why is there a URL path?
When you say the schema registry is local, what do you mean by that? Do you mean using GMS as the schema registry with an INTERNAL configuration?
If using the INTERNAL configuration, how did you implement the SSL encryption? What is doing the decoding, and what is managing the certs there?

Yes, we are using the python emitter. In the Kafka configuration we have set up basic authentication for Kafka (for external communication), and in the producer config we provide the settings required to connect to the Kafka cluster deployed in the other environment.
For the schema registry we haven’t set up any authentication; it is only on https (using the internal GMS schema registry). In the schema registry config we pass ssl.ca.location, but it is unable to verify the certs; it works when the schema registry URL is on http.
Error:
certificate verify failed: unable to get local issuer certificate
All of this configuration comes from the Pipeline config section of my action.

Is this the correct way to pass the CA chain? For Kafka it is working, but for the schema registry it throws the error.
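(One way to check whether the CA chain itself is the problem, independent of the emitter, is to hand the same bundle to Python's ssl module and open a connection to the registry endpoint. Host, port, and paths below are placeholders:)

```python
import socket
import ssl

# Placeholder host/port and CA path; substitute your schema registry endpoint.
ctx = ssl.create_default_context(cafile="/certs/ca-chain.pem")
with socket.create_connection(("datahub-gms", 8080)) as sock:
    with ctx.wrap_socket(sock, server_hostname="datahub-gms") as tls:
        print(tls.version(), tls.getpeercert()["subject"])
# An SSLCertVerificationError "unable to get local issuer certificate" here
# means the bundle is missing the intermediate/root CA that signed the cert.
```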

<@UV14447EU>

Is this a custom action?
What do you do with that config in your action?

We have a custom action that converts MCLs into MCPs, and we use the Kafka emitter to push those MCPs into the client environment, where they are replicated via MirrorMaker.
In the action config we pass the required keys: bootstrap URL, CA location, and the username and password for basic auth.

We are able to connect to the client environment's Kafka, but the schema registry (internal GMS) does not work on https (ssl certificate verification failed: unable to get local issuer certificate), while it works on http.
We have provided the configuration as shown in the picture above.

Based on the code, whatever you set in schema_registry_config is passed straight through to Confluent's Schema Registry client -> https://github.com/datahub-project/datahub/blob/ddcd5109dcbe01aac28347cf34221d65cb5faa30/metadata-ingestion/src/datahub/emitter/kafka_emitter.py#L65

Here in the example you can see how to add the schema_registry_config -> https://datahubproject.io/docs/metadata-ingestion/as-a-library/#example-usage-1
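(Paraphrasing the linked source: the url plus every key in schema_registry_config is handed directly to Confluent's SchemaRegistryClient, so the keys must be ones that client understands, e.g. ssl.ca.location, ssl.certificate.location, ssl.key.location, basic.auth.user.info. Roughly, with placeholder values:)

```python
from confluent_kafka.schema_registry import SchemaRegistryClient

# Paraphrased sketch of the linked kafka_emitter.py line: the emitter builds
# Confluent's client from the url plus the schema_registry_config dict as-is.
schema_registry_config = {"ssl.ca.location": "/certs/ca-chain.pem"}
client = SchemaRegistryClient(
    {"url": "https://datahub-gms:8080/schema-registry/api/", **schema_registry_config}
)
```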

I tried to understand, and I am now providing the values like this. Is this the correct way of passing them? <@UV14447EU>

This looks fine to me

While running, I am getting an "x509: certificate signed by unknown authority" error.

Can you check if the generated config looks ok?
It is hard to see from this code snippet.

Do Kafka and the schema registry require different certificates?

It depends on your setup
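(Concretely: the broker connection and the schema registry connection each verify against their own ssl.ca.location, so if the two endpoints are signed by different CAs, each config needs to point at the right bundle. A sketch with placeholder paths:)

```python
connection = {
    "bootstrap": "client-kafka:9093",
    "producer_config": {
        # CA that signed the Kafka brokers' certificates
        "ssl.ca.location": "/certs/kafka-ca.pem",
    },
    "schema_registry_url": "https://datahub-gms:8080/schema-registry/api/",
    "schema_registry_config": {
        # CA that signed the schema registry (GMS) certificate; may differ
        "ssl.ca.location": "/certs/gms-ca.pem",
    },
}
```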

The values are coming from the action config file, and I have checked them.
I am getting this error:

x509: certificate signed by unknown authority

But here the GMS healthcheck threw the certificate issue, not the action, am I right?

Or, more precisely, when the action tries to connect to the healthcheck.
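(If the healthcheck is the failing call, it can be reproduced outside the action by hitting GMS directly with the same CA bundle. The host below is a placeholder, and this assumes GMS serves its /health endpoint over https:)

```python
import requests

# Placeholder GMS host; verify= points at the same CA bundle the action uses.
resp = requests.get("https://datahub-gms:8080/health", verify="/certs/ca-chain.pem")
print(resp.status_code)
# A requests.exceptions.SSLError (certificate verify failed / unknown authority)
# here confirms the bundle does not cover the certificate GMS is serving.
```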