Fixing Kafka Message Size Issue

Original Slack Thread

Hello there, any tips on how to fix this?

Sink (datahub-rest) report:
{'total_records_written': 36241,
'records_written_per_second': 3,
'warnings': ,
'failures': [{'error': 'Unable to emit metadata to DataHub GMS: java.lang.RuntimeException: java.util.concurrent.ExecutionException: '
'org.apache.kafka.common.errors.RecordTooLargeException: The message is 1296474 bytes when serialized which is larger than '
'1048576, which is the value of the max.request.size configuration.',
'info': {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException',
'message': 'java.lang.RuntimeException: java.util.concurrent.ExecutionException: '
'org.apache.kafka.common.errors.RecordTooLargeException: The message is 1296474 bytes when serialized which is '
'larger than 1048576, which is the value of the max.request.size configuration.',
'status': 500,
'id': 'urn:li:dataset:(urn:li:dataPlatform:mssql,CIGAM.dbo.GFHISOPB,PROD)'}}],
'start_time': '2023-10-12 23:46:47.724130 (2 hours, 59 minutes and 13.03 seconds ago)',
'current_time': '2023-10-13 02:46:00.756905 (now)',
'total_duration_in_seconds': 10753.03,
'gms_version': 'null',
'pending_requests': 0}

Pipeline finished with at least 1 failures; produced 36242 events in 2 hours, 59 minutes and 12.77 seconds.
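
For reference, the two numbers in the error can be pulled out and compared. This is a minimal Python sketch, assuming the error text has the shape shown in the sink report; the regex and the helper name `parse_record_too_large` are illustrative and not part of DataHub.

```python
import re

# Error text copied from the sink report in this thread.
ERROR = (
    "org.apache.kafka.common.errors.RecordTooLargeException: The message is "
    "1296474 bytes when serialized which is larger than 1048576, which is the "
    "value of the max.request.size configuration."
)


def parse_record_too_large(text):
    """Hypothetical helper: return (message_bytes, limit_bytes) or None."""
    match = re.search(
        r"The message is (\d+) bytes when serialized which is larger than (\d+)",
        text,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


message_bytes, limit_bytes = parse_record_too_large(ERROR)
overage = message_bytes - limit_bytes

print(f"limit: {limit_bytes} bytes ({limit_bytes / 1024 / 1024:.0f} MiB)")
print(f"message exceeds the limit by {overage} bytes ({overage / limit_bytes:.1%})")
```

The limit of 1048576 bytes is 1 MiB, which is the Kafka producer default for max.request.size, and the failing record is 247898 bytes over it.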

<@U054JJ71DKL> It is a Kafka message size issue. You might need to debug and increase the maximum message size.

Some details are available here: https://www.conduktor.io/kafka/how-to-send-large-messages-in-apache-kafka/
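
As a rough sketch of what raising the message size usually involves on the Kafka side: the producer limit named in the error (max.request.size) generally has to be raised together with the matching broker, topic, and consumer limits, otherwise the larger record may still be rejected or not fetched further down the line. The property names below are standard Kafka settings. The helper name `size_settings`, the 5 MiB target, and where these values are actually set in a DataHub deployment are assumptions, not something confirmed in this thread.

```python
def size_settings(max_bytes):
    """Hypothetical helper: group the Kafka size limits that usually move together."""
    return {
        "producer": {"max.request.size": max_bytes},
        "broker": {
            "message.max.bytes": max_bytes,
            "replica.fetch.max.bytes": max_bytes,
        },
        "topic": {"max.message.bytes": max_bytes},
        "consumer": {"max.partition.fetch.bytes": max_bytes},
    }


FAILED_MESSAGE_BYTES = 1296474   # size reported in the error
TARGET_BYTES = 5 * 1024 * 1024   # assumed target (5 MiB), pick what fits your data

settings = size_settings(TARGET_BYTES)

# Sanity check: every limit must be larger than the record that failed.
too_small = [
    name
    for props in settings.values()
    for name, value in props.items()
    if value <= FAILED_MESSAGE_BYTES
]

for scope, props in settings.items():
    for name, value in props.items():
        print(f"{scope}: {name}={value}")
```

Raising only the producer setting is typically not enough, since the broker and topic limits are checked independently of the producer.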

Thanks

I'm sorry, I couldn't find that property.

Am I in the right place? [attachment]

<@U0348BYAS56> :pray:

<@U054JJ71DKL> I am also not aware of where the property is located. <@UV14447EU> do you have any idea?

I think <@U03MF8MU5P0> is the one who might be able to help.

I have a work-in-progress PR for configuring Kafka message sizes. Something is currently broken in the CI tests, and I haven't had time to debug further. Once these PRs are in place, they should cover all the configuration. DH Project: https://github.com/datahub-project/datahub/pull/9038 Helm: https://github.com/acryldata/datahub-helm/pull/383