
KAFKA-15418: update Kafka design docs with decompression information (#14322)

Reviewers: Divij Vaidya <diviv@amazon.com>, Matthias J. Sax <mjsax@apache.org>

---------

Co-authored-by: Cerchie <lcerchie@confluent.io>
pull/14358/head
Lucia Cerchie 1 year ago committed by GitHub
parent
commit
17862ffaf2
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
docs/design.html (5 lines changed)

@@ -136,8 +136,9 @@
the user can always compress its messages one at a time without any support needed from Kafka, but this can lead to very poor compression ratios as much of the redundancy is due to repetition between messages of
the same type (e.g. field names in JSON or user agents in web logs or common string values). Efficient compression requires compressing multiple messages together rather than compressing each message individually.
<p>
-Kafka supports this with an efficient batching format. A batch of messages can be clumped together compressed and sent to the server in this form. This batch of messages will be written in compressed form and will
-remain compressed in the log and will only be decompressed by the consumer.
+Kafka supports this with an efficient batching format. A batch of messages can be grouped together, compressed, and sent to the server in this form. The broker decompresses the batch in order to validate it. For
+example, it validates that the number of records in the batch is same as what batch header states. This batch of messages is then written to disk in compressed form. The batch will remain compressed in the log and it will also be transmitted to the
+consumer in compressed form. The consumer decompresses any compressed data that it receives.
<p>
Kafka supports GZIP, Snappy, LZ4 and ZStandard compression protocols. More details on compression can be found <a href="https://cwiki.apache.org/confluence/display/KAFKA/Compression">here</a>.
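
The behavior described in the updated paragraph can be illustrated with a minimal Python sketch. It is not Kafka code: the batch layout (a 4-byte record count followed by a gzip payload of length-prefixed records), the sample messages, and all function names are assumptions made up for this example. It shows two things from the text: compressing messages together is smaller than compressing them one at a time, and a broker-style check can decompress a batch to confirm that the record count matches what the header states.

```python
import gzip
import json
import struct


def make_messages(n=200):
    # Hypothetical JSON events; the field names repeat in every message,
    # which is the cross-message redundancy the docs refer to.
    return [
        json.dumps(
            {"user_agent": "Mozilla/5.0", "event_type": "page_view", "user_id": i}
        ).encode("utf-8")
        for i in range(n)
    ]


def compressed_size_individually(messages):
    # Compress each message on its own and add up the sizes.
    return sum(len(gzip.compress(m)) for m in messages)


def encode_batch(messages):
    # Toy layout (NOT Kafka's record batch format):
    # 4-byte big-endian record count, then gzip(length-prefixed records).
    payload = b"".join(struct.pack(">I", len(m)) + m for m in messages)
    return struct.pack(">I", len(messages)) + gzip.compress(payload)


def validate_batch(batch):
    # Broker-style check: decompress, count the records, and compare the
    # count with the value declared in the header.
    (declared,) = struct.unpack(">I", batch[:4])
    payload = gzip.decompress(batch[4:])
    count, offset = 0, 0
    while offset < len(payload):
        if offset + 4 > len(payload):
            return False  # truncated length prefix
        (size,) = struct.unpack(">I", payload[offset:offset + 4])
        offset += 4 + size
        count += 1
    return count == declared and offset == len(payload)


if __name__ == "__main__":
    msgs = make_messages()
    batch = encode_batch(msgs)
    print("individually:", compressed_size_individually(msgs), "bytes")
    print("as one batch:", len(batch), "bytes")
    print("batch valid:", validate_batch(batch))
```

In this sketch the batch stays compressed after validation, mirroring the text: the check only reads the decompressed payload and the original compressed bytes are what would be stored and forwarded.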
