This PR implements KIP-78:Cluster Identifiers [(link)](https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id#KIP-78:ClusterId-Overview) and includes the following changes:
1. Changes to broker code
- generate cluster id and store it in Zookeeper
- update protocol to add cluster id to metadata request and response
- add ClusterResourceListener interface, ClusterResource class and ClusterMetadataListeners utility class
- send ClusterResource events to the metric reporters
2. Changes to client code
- update Cluster and Metadata code to support cluster id
- update clients for sending ClusterResource events to interceptors, (de)serializers and metric reporters
3. Integration tests for interceptors, (de)serializers and metric reporters for clients and for protocol changes and metric reporters for broker.
4. System tests for upgrading from previous versions.
Author: Sumit Arrawatia <sumit.arrawatia@gmail.com>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1830 from arrawatia/kip-78
The `JsonConverter` class has `LogicalTypeConverter` implementations for Date, Time, Timestamp, and Decimal, but these implementations fail when the input literal value (deserialized from the message) is null.
Test cases were added to check for these cases, and these failed before the `LogicalTypeConverter` implementations were fixed to consider whether the schema has a default value or is optional, similarly to how the `JsonToConnectTypeConverter` implementations do this. Once the fixes were made, the new tests pass.
Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Shikhar Bhushan <shikhar@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#1867 from rhauch/kafka-4183
This is joint work between dguy and enothereska. The work implements KIP-63. Overview of main changes:
- New byte-based cache that acts as a buffer for any persistent store and for forwarding changes downstream.
- Forwarding record path changes: previously a record in a task completed end-to-end. Now it may be buffered in a processor node while other records complete in the task.
- Cleanup and state stores and decoupling of cache from state store and forwarding.
- More than 80 new unit and integration tests.
Author: Damian Guy <damian.guy@gmail.com>
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes#1752 from enothereska/KAFKA-3776-poc
This applies to Replication Quotas
based on KIP-73 [(link)](https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas) originally motivated by KAFKA-1464.
System Tests Run: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/544/
**This first PR demonstrates the approach**.
**_Overview of Change_**
The guts of this change are relatively small. Throttling occurs on both leader and follower sides. A single class tracks the throttled throughput in and out of each broker (**_ReplicationQuotaManager_**).
On the follower side, the Follower Throttled Rate is calculated as fetch responses arrive. Then, before the next fetch request is sent, we check to see if the quota is violated, removing throttled partitions from the request if it is. This is all encapsulated in a few lines of code in the **_ReplicaFetcherThread_**. There is existing code to handle temporal back off, if the request ends up being empty.
On the leader side it's a little more complex. When a fetch request arrives in the leader, it is built, partition by partition, in **_ReplicaManager.readFromLocalLog_**. As we put each partition into the fetch response, we check if the total size fits in the current quota. If the quota is exceeded, the partition will not be added to the fetch response. Importantly, we don't increase the quota at this point, we just check to see if the bytes will fit.
Now, if there aren't enough bytes to send the response immediately, which is common if we're catching up and throttled, then the request will be put in purgatory. I've added some simple code to **_DelayedFetch_** to handle throttled partitions (throttled partitions are checked against the quota, rather than the messages available in the log).
When the delayed fetch completes, and exits purgatory, _**ReplicaManager.readFromLocalLog**_ will be called again. This is why _**ReplicaManager.readFromLocalLog**_ does not actually increase the quota, it just checks whether enough bytes are available for a partition.
Finally, when there are enough bytes to be sent, or the delayed fetch times out, the response will be sent. Before it is sent the throttled-outbound-rate is increased, based on the size of throttled partitions being sent. This is at the end of _**KafkaApis.handleFetchRequest**_, exactly where client quotas are recorded.
There is an acceptance test which asserts the whole throttling process stabilises on the desired value. This covers a number of use cases including many-to-many replication. See **_ReplicationQuotaTest_**.
Note:
It should be noted that this protocol can over-request. The request is built, based on the quota at time t1 (_ReplicaManager.readFromLocalLog_). The bytes in the response are recorded at time t2 (end of _KafkaApis.handleFetchRequest_), where t2 > t1. For this reason I originally included an OverRequestedRate as a JMX metric, but testing has not seen revealed any obvious issue. Over-requesting is quickly compensated by subsequent requests, stabilising close to the quota value.
_**Main stuff left to do:**_
- The fetch size is currently unbounded. This will be addressed in KIP-74, but we need to ensure this ensures requests don’t go beyond the throttle window.
- There are two failures showing up in the system tests on this branch: StreamsSmokeTest.test_streams (which looks like it fails regularly) and OffsetValidationTest.test_broker_rolling_bounce (which I need to look into)
_**Stuff left to do that could be deferred:**_
- Add the extra metrics specified in the KIP.
- There are no system tests.
- There is no validation for the cluster size / throttle combination that could lead to ISR dropouts
Author: Ben Stopford <benstopford@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Apurva Mehta <apurva@confluent.io>, Jun Rao <junrao@gmail.com>
Closes#1776 from benstopford/rep-quotas-v2
Fix for bug outlined in KAFKA-4131
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Damian Guy, Guozhang Wang
Closes#1843 from bbejeck/KAFKA-4131_mulitple_regex_consumers_cause_npe
Set the NUM_STREAM_THREADS_CONFIG = 1 in SmokeTestClient as we get locking issues when we have NUM_STREAM_THREADS_CONFIG > 1 and we have Standby Tasks, i.e., replicas. This is because the Standby Tasks can be assigned to the same KafkaStreams instance as the active task, hence the directory is locked
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Guozhang Wang
Closes#1861 from dguy/fix-smoketest
Followed the same naming pattern as the producer sender thread.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson
Closes#1854 from ijuma/heartbeat-thread-name
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Joel Koshy <jjkoshy.w@gmail.com>, Jiangjie Qin <becket.qin@gmail.com>
Closes#1851 from lindong28/KAFKA-4158
Here is the patch on github ijuma.
Acquiring the consumer lock (the single thread access controls) requires that the consumer be open. I changed the closed variable to be volatile so that another thread's writes will visible to the reading thread.
Additionally, there was an additional check if the consumer was closed after the lock was acquired. This check is no longer necessary.
This is my original work and I license it to the project under the project's open source license.
Author: Tim Brooks <tim@uncontended.net>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#1637 from tbrooks8/KAFKA-2311
A couple of the tests may transiently fail in QueryableStateIntegrationTest as they are not catching InvalidStateStoreException. This exception is expected during rebalance.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Guozhang Wang
Closes#1840 from dguy/minor-fix
Now uses LogSegment.largestTimestamp to determine age of segment's messages.
Author: Eric Wasserman <eric.wasserman@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#1794 from ewasserman/feat-1981
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Dan Norwood <norwood@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#1821 from hachikuji/KAFKA-3807
This PR changes topic subscription semantics so a change in subscription does not immediately cause a rebalance.
Instead, the next poll or the next scheduled metadata refresh will update the assigned partitions.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Jason Gustafson
Closes#1726 from vahidhashemian/KAFKA-4033
The verification in verifyGreaterOrEqual was incorrect. It was failing when a new key was found.
Set the TimeWindow to a large value so all windowed results fall in a single window
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1833 from dguy/minor-test-fix
changelogs of window stores now configure cleanup.policy=compact,delete with retention.ms set to window maintainMs + StreamsConfig.WINDOW_STORE_CHANGE_LOG_ADDITIONAL_RETENTION_MS_CONFIG
StoreChangeLogger produces messages with context.timestamp().
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Guozhang Wang
Closes#1792 from dguy/kafka-3595
Mark the store as open after the DB has been restored from the changelog.
Only add the store to the map in ProcessorStateManager post restore.
Make RocksDBWindowStore.Segment override openDB(..) as it needs to mark the Segment as open.
Throw InvalidStateStoreException if any stores in a KafkaStreams instance are not available.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Guozhang Wang
Closes#1824 from dguy/kafka-4123
Update documentation for Kafka Connect distributed’s config.storage.topic, offset.storage.topic, and status.storage.topic configuration values to indicate that all three should refer to compacted topics.
Author: Mathieu Fenniak <mathieu.fenniak@replicon.com>
Reviewers: Jason Gustafson
Closes#1832 from mfenniak/kafka-connect-topic-docs
change console producer default acks to 1, update acks docs. Also added the -1 config to the acks docs since that question comes up often. ijuma and vahidhashemian, does this look reasonable to you?
Author: Dustin Cote <dustin@confluent.io>
Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1795 from cotedm/KAFKA-3129
- use AdminTool to check for active consumer group
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1767 from mjsax/kafka-4058-trunk
The commit here changes the log level of a log message from WARN to DEBUG.
As noted in the mail discussion here https://www.mail-archive.com/devkafka.apache.org/msg56035.html,
in a pretty straightforward/typical and valid setup, the broker logs get
flooded with the following message:
[2016-09-02 08:07:13,773] WARN SSL peer is not authenticated, returning ANONYMOUS instead (org.apache.kafka.common.network.SslTransportLayer)
Author: Jaikiran Pai <jaikiran.pai@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1825 from jaikiran/ssl-log-level
Also add tests and make `Crc32.update` perform the same argument checks as
`java.util.zip.CRC32`.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes#1672 from ijuma/record-is-valid-should-be-more-robust
Get channel remote address before calling ```channel.close```
Author: Tao Xiao <xiaotao183@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1826 from xiaotao183/KAFKA-4129
Typically this error condition is caused by topic-level configuration issues, so it is useful to include which topic partition was reset for operator use when debugging the root cause.
Author: Dana Powers <dana.powers@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1801 from dpkp/log_topic_partition_reset_dirty_offset
Author: Jiangjie Qin <becket.qin@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#1809 from becketqin/KAFKA-4099
If the thread or process is not the coordinator the Cluster instance in StreamPartitionAssignor will always be null. This builds an instance of the Cluster with the metadata associated with the Assignment
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1804 from dguy/kafka-4104
Rephrase 'alpha quality' wording in Streams section of api.html.
Couple of other minor fixes in streams.html
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang, Ismael Juma, Michael G. Noll
Closes#1811 from dguy/kstreams-312
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1807 from hachikuji/KAFKA-4103
Modifies example in doc change
Author: ybyzek <ybyzek@users.noreply.github.com>
Reviewers: Guozhang Wang, Ismael Juma
Closes#1805 from ybyzek/onComplete_doc
set print-data-log option when offset-decoder is set. hachikuji we had talked about this one before, does this change look ok to you?
Author: Dustin Cote <dustin@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1797 from cotedm/KAFKA-4062
Invoke the statusListener.onFailure() callback on start failures so that the statusBackingStore is updated. This involved a fix to the putSafe() functionality which prevented any update that was not preceded by a (non-safe) put() from completing, so here when a connector or task is transitioning directly to FAILED.
Worker start methods can still throw if the same connector name or task ID is already registered with the worker, as this condition should not happen.
Author: Shikhar Bhushan <shikhar@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1778 from shikhar/distherder-stayup-take4
There's a minor bug in ProcessorTopologyTestDriver that prevents it from working with a topology that contains multiple sources. The bug is that ```consumer.assign()``` is called while looping through all the source topics, but, consumer.assign resets the state of the MockConsumer to only consume from the topics passed in. This patch fixes the issue by calling consumer.assign once with all the TopicPartition instances. Unit test (testDrivingSimpleMultiSourceTopology) included.
This contribution is my original work and I license the work to the project under the project's open source license.
Author: Mathieu Fenniak <mathieu.fenniak@replicon.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1782 from mfenniak/ProcessorTopologyTestDriver-multiple-source-bugfix
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1793 from ijuma/include-request-header-if-request-correlation-fails