@ -464,7 +464,7 @@ situations where the upstream data source would not otherwise be replayable.
@@ -464,7 +464,7 @@ situations where the upstream data source would not otherwise be replayable.
Here is a high-level picture that shows the logical structure of a Kafka log with the offset for each message.
The head of the log is identical to a traditional Kafka log. It has dense, sequential offsets and retains all messages. Log compaction adds an option for handling the tail of the log. The picture above shows a log
with a compacted tail. Note that the messages in the tail of the log retain the original offset assigned when they were first written—that never changes. Note also that all offsets remain valid positions in
@ -478,7 +478,7 @@ marked as the "delete retention point" in the above diagram.
@@ -478,7 +478,7 @@ marked as the "delete retention point" in the above diagram.
The compaction is done in the background by periodically recopying log segments. Cleaning does not block reads and can be throttled to use no more than a configurable amount of I/O throughput to avoid impacting
producers and consumers. The actual process of compacting a log segment looks something like this:
@ -201,7 +201,7 @@ value : V bytes
@@ -201,7 +201,7 @@ value : V bytes
<p>
The use of the message offset as the message id is unusual. Our original idea was to use a GUID generated by the producer, and maintain a mapping from GUID to offset on each broker. But since a consumer must maintain an ID for each server, the global uniqueness of the GUID provides no value. Furthermore, the complexity of maintaining the mapping from a random id to an offset requires a heavy weight index structure which must be synchronized with disk, essentially requiring a full persistent random-access data structure. Thus to simplify the lookup structure we decided to use a simple per-partition atomic counter which could be coupled with the partition id and node id to uniquely identify a message; this makes the lookup structure simpler, though multiple seeks per consumer request are still likely. However once we settled on a counter, the jump to directly using the offset seemed natural—both after all are monotonically increasing integers unique to a partition. Since the offset is hidden from the consumer API this decision is ultimately an implementation detail and we went with the more efficient approach.
The log allows serial appends which always go to the last file. This file is rolled over to a fresh file when it reaches a configurable size (say 1GB). The log takes two configuration parameters: <i>M</i>, which gives the number of messages to write before forcing the OS to flush the file to disk, and <i>S</i>, which gives a number of seconds after which a flush is forced. This gives a durability guarantee of losing at most <i>M</i> messages or <i>S</i> seconds of data in the event of a system crash.
@ -51,7 +51,7 @@ In Kafka the communication between the clients and the servers is done with a si
@@ -51,7 +51,7 @@ In Kafka the communication between the clients and the servers is done with a si
<p>Let's first dive into the core abstraction Kafka provides for a stream of records—the topic.</p>
<p>A topic is a category or feed name to which records are published. Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or many consumers that subscribe to the data written to it.</p>
<p> For each topic, the Kafka cluster maintains a partitioned log that looks like this: </p>
<imgsrc="images/log_anatomy.png">
<imgclass="centered"src="images/log_anatomy.png">
<p> Each partition is an ordered, immutable sequence of records that is continually appended to—a structured commit log. The records in the partitions are each assigned a sequential id number called the <i>offset</i> that uniquely identifies each record within the partition.