Recommended Sizes for Producer and Consumer Values in Kafka Helm Chart for a Production System

Original Slack Thread

What are the recommended sizes for the producer and consumer values for Kafka in the Helm chart for a production system?

Hey there! :wave:

<@U06TM7M40RK> Can you help with the questions in this thread?

To configure Kafka for a production system using the Helm chart, you need to consider several parameters for both the producer and consumer. The recommended sizes for these parameters can vary based on your specific workload and infrastructure, but here are some general guidelines:

  1. Producer Configuration:

    • SPRING_KAFKA_PRODUCER_PROPERTIES_MAX_REQUEST_SIZE: This sets the maximum size of a request that the producer can send. According to the DataHub environment variables documentation (https://datahubproject.io/docs/0.14.0/deploy/environment-vars#kafka), the default is 1MB (1048576 bytes). A common setting is 5MB (5242880 bytes), and it can be raised further if your messages are larger.
    • KAFKA_PRODUCER_MAX_REQUEST_SIZE: This also sets the maximum request size for the Kafka producer. Its default is 5MB (5242880 bytes).
  2. Consumer Configuration:

    • SPRING_KAFKA_CONSUMER_PROPERTIES_MAX_PARTITION_FETCH_BYTES: This sets the maximum amount of data per partition that the consumer will fetch in a single request. The default is 5MB (5242880 bytes), which is suitable for most use cases.
    • KAFKA_CONSUMER_MAX_PARTITION_FETCH_BYTES: This is similar to the above and also defaults to 5MB (5242880 bytes).
  3. General Kafka Configuration:

    • MAX_MESSAGE_BYTES: This sets the maximum message size that Kafka topics can handle. The default is 5MB (5242880 bytes), but it can be adjusted based on your needs.

These configurations can be set in the values.yaml file of your Helm chart under the extraEnvs section for the respective components (e.g., datahub-gms, datahub-frontend, datahub-mae-consumer, datahub-mce-consumer).
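
As a rough illustration of the extraEnvs approach described above, a values.yaml fragment might look like the sketch below. The component keys and variable names come from this thread. The assignment of producer variables to datahub-gms and consumer variables to the consumer components is an assumption made for illustration, and 5242880 is the 5MB figure quoted above, not a tuned recommendation. Check the key layout against the chart version you deploy.

```yaml
# values.yaml (sketch). Component keys and the extraEnvs section follow the
# thread; which variable belongs on which component is an assumption.
datahub-gms:
  extraEnvs:
    # Producer side: largest request the producer may send (5MB).
    - name: KAFKA_PRODUCER_MAX_REQUEST_SIZE
      value: "5242880"
    - name: SPRING_KAFKA_PRODUCER_PROPERTIES_MAX_REQUEST_SIZE
      value: "5242880"

datahub-mae-consumer:
  extraEnvs:
    # Consumer side: most data fetched per partition in one request (5MB).
    - name: KAFKA_CONSUMER_MAX_PARTITION_FETCH_BYTES
      value: "5242880"
    - name: SPRING_KAFKA_CONSUMER_PROPERTIES_MAX_PARTITION_FETCH_BYTES
      value: "5242880"

datahub-mce-consumer:
  extraEnvs:
    # Same consumer limits as above (assumed to apply here as well).
    - name: KAFKA_CONSUMER_MAX_PARTITION_FETCH_BYTES
      value: "5242880"
    - name: SPRING_KAFKA_CONSUMER_PROPERTIES_MAX_PARTITION_FETCH_BYTES
      value: "5242880"
```

The values are quoted because Kubernetes environment variable values must be strings. Whatever sizes you pick, keep the topic limit (MAX_MESSAGE_BYTES) at least as large as the largest message the producer is allowed to send, or the broker will reject oversized records.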

Would you like more detailed steps on how to configure these settings in your Helm chart?
