Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian.guy@gmail.com>
Closes #4104 from mjsax/minor-uncaught-exception-handler
The methods resetReconnectBackoff and updateReconnectBackoff in ClusterConnectionStates both take an instance of a private inner class as a parameter and thus cannot be called from outside the class anyway.
Author: Soenke Liebau <soenke.liebau@opencore.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #4114 from soenkeliebau/MINOR_private
* Fix issue in `retryRequestsUntilConnected` where the same response
could appear multiple times (implies that we are lacking test coverage)
* Introduce type member in AsyncRequest for the AsyncResponse
type and refactor the code to eliminate most downcasts
* Remove a number of unnecessary collection copies in
`retryRequestsUntilConnected`
* Move ControllerContext to its own file
* Rename getACL/setACL to getAcl/setAcl to match Kafka naming
convention
* Replace tuple of 3 elements with case class in one place (we
should do this in other places too)
* Extract `send` and `shouldWatch` from
`ZooKeeperClient.handleRequests`
* Use pattern matching instead of if/else chains in a few places (we
should do it in more places)
* A couple of renames to avoid overloads and hence benefit from
better type inference
* Use Option and default arguments instead of passing null in
some places
* `Expired` is no longer a case class since it has no parameters,
but it has state
* Various minor clean-ups
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Onur Karaman <okaraman@linkedin.com>
Closes #4088 from ijuma/async-zkclient-cleanups
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #4103 from rajinisivaram/KAFKA-6042-group-deadlock
(cherry picked from commit 5ee157126d595b913761cf1887963460bbe12855)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
The idempotent producer doesn't change that setting any more and the
accepted range has changed.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Apurva Mehta <apurva@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes #4097 from ijuma/fix-javadoc-wrt-max-in-flight-for-idempotent
Author: Tommy Becker <tobecker@tivo.com>
Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian.guy@gmail.com>
Closes #4081 from twbecker/KAFKA-6069
Currently, in branches _trunk_, _0.11.0_, and _1.0_ the property **max.in.flight.requests.per.connection** is misspelled as _max.inflight.requests.per.connection_.
harshach ijuma guozhangwang can you please review. Thank you.
Author: Hugo Louro <hmclouro@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #4094 from hmcl/trunk_MINOR_Doc_InflightProp
Rearranged the testAddPartitionDuringDeleteTopic() test to keep the
likelihood of the race condition.
Author: Maytee Chinavanichkit <maytee.chinavanichkit@linecorp.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes #4056 from mayt/KAFKA-6051
Kafka today uses ZkClient, a wrapper client around the raw Zookeeper client. This library only exposes synchronous apis to the user. Synchronous apis mean we must wait an entire round trip before doing the next operation.
This becomes problematic with partition-heavy clusters, as we find the controller spending a significant amount of time just sending many sequential reads and writes to zookeeper at the per-partition granularity. This especially becomes an issue during:
- controller failover, where the newly elected controller effectively reads all zookeeper state.
- broker failures and controlled shutdown. The controller tries to elect a new leader for partitions previously led by the broker. The controller also removes the broker from isr on partitions for which the broker was a follower. These all incur partition-granular reads and writes to zookeeper.
As a first step in addressing these issues, we built a low-level wrapper client called ZookeeperClient in KAFKA-5501 that encourages pipelined, asynchronous apis.
This patch converts the controller to use the async ZookeeperClient to improve controller failover, broker failure handling, and controlled shutdown times.
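The difference between sequential and pipelined access can be illustrated with a minimal sketch. It does not use the real ZookeeperClient: `PipelinedReads`, `readAsync`, and `readAll` are hypothetical names, and an in-memory map stands in for the zookeeper ensemble. The point is only that all requests are issued before any response is awaited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for an asynchronous ZooKeeper-style client.
final class PipelinedReads {
    private final Map<String, String> data = new ConcurrentHashMap<>();

    void put(String path, String value) {
        data.put(path, value);
    }

    // Returns immediately; the caller does not block for a round trip.
    CompletableFuture<String> readAsync(String path) {
        return CompletableFuture.supplyAsync(() -> data.get(path));
    }

    // Pipelined: issue every request first, then collect responses in request order.
    List<String> readAll(List<String> paths) {
        List<CompletableFuture<String>> inFlight = new ArrayList<>();
        for (String path : paths) {
            inFlight.add(readAsync(path));
        }
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> response : inFlight) {
            results.add(response.join());
        }
        return results;
    }

    public static void main(String[] args) {
        PipelinedReads client = new PipelinedReads();
        client.put("/topics/a/partitions/0", "leader=1");
        client.put("/topics/a/partitions/1", "leader=2");
        System.out.println(client.readAll(
            Arrays.asList("/topics/a/partitions/0", "/topics/a/partitions/1")));
    }
}
```

With a synchronous client the loop would pay one round trip per partition; here the waits overlap, so total latency is closer to a single round trip plus processing time.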
Some notable changes made in this patch:
- All ControllerEvents now defer access to zookeeper until processing time rather than doing it at enqueue time, as was intended with the single-threaded event queue model patch from KAFKA-5028. This results in a fresh view of the zookeeper state by the time we process the event. This reverts the hacks from KAFKA-5502 and KAFKA-5879.
- We refactored PartitionStateMachine and ReplicaStateMachine to process multiple partitions and replicas in batch rather than one-at-a-time so that we can send a batch of requests over to ZookeeperClient to pipeline.
- We've decoupled ZookeeperClient handler registration from watcher registration. Previously, these two were coupled, which meant handler registrations actually sent out a request to the zookeeper ensemble to do the actual watcher registration. In KafkaController.onControllerFailover, we register partition modification handlers (thereby registering watchers) and additionally look up the partition assignments for every topic in the cluster. We can shave a bit of time off failover if we merge these two operations, which the decoupling makes possible. This means ZookeeperClient's registration apis have been changed so that they are purely in-memory operations, and they only take effect when the client sends ExistsRequest, GetDataRequest, or GetChildrenRequest.
- We've simplified the logic for updating LeaderAndIsr such that if we get a BADVERSION error code, the controller will now just retry in the next round by reading the new state and trying the update again. This simplifies logic when updating the partition leader epoch, removing replicas from isr, and electing leaders for partitions.
- We've implemented KAFKA-5083: always leave the last surviving member of the ISR in ZK. This means that if people re-disable unclean leader election, we can still try to elect the leader from the last in-sync replica.
- ZookeeperClient's handlers have been changed so that their methods default to no-ops for convenience.
- All znode paths and definitions for znode encoding and decoding have been consolidated as static methods in ZkData.scala.
- The partition leader election algorithms have been refactored as pure functions so that they can be easily unit tested.
- PartitionStateMachine and ReplicaStateMachine now have unit tests.
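The "retry in the next round" handling of BADVERSION described in the list above can be sketched as an optimistic, version-checked update loop. This is an illustration under stated assumptions, not the controller's code: `Znode`, `setIfVersion`, and `updateWithRetry` are hypothetical names, and a failed conditional write plays the role of the BADVERSION error code.

```java
import java.util.function.UnaryOperator;

final class VersionedUpdate {
    // Hypothetical versioned node: a value plus a version bumped on every write.
    static final class Znode {
        private String value;
        private int version;

        Znode(String value) {
            this.value = value;
        }

        synchronized String value() {
            return value;
        }

        synchronized int version() {
            return version;
        }

        // Conditional write: fails (analogous to BADVERSION) if the version moved.
        synchronized boolean setIfVersion(String newValue, int expectedVersion) {
            if (expectedVersion != version) {
                return false;
            }
            value = newValue;
            version++;
            return true;
        }
    }

    // Each round re-reads the current state and retries the update.
    // Returns the round in which the write succeeded.
    static int updateWithRetry(Znode node, UnaryOperator<String> update, int maxRounds) {
        for (int round = 1; round <= maxRounds; round++) {
            String seenValue;
            int seenVersion;
            synchronized (node) { // read value and version as one consistent snapshot
                seenValue = node.value();
                seenVersion = node.version();
            }
            if (node.setIfVersion(update.apply(seenValue), seenVersion)) {
                return round;
            }
        }
        throw new IllegalStateException("update did not succeed in " + maxRounds + " rounds");
    }

    public static void main(String[] args) {
        Znode node = new Znode("leader=1,isr=1,2");
        int round = updateWithRetry(node, v -> v + ",epoch+1", 3);
        System.out.println("succeeded in round " + round + ": " + node.value());
    }
}
```

The design choice is that a conflict needs no special recovery path: the next round simply starts from whatever state is now stored.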
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes #3765 from onurkaraman/KAFKA-5642
long sizeBytes() {
    long sizeInBytes = 0;
    for (final NamedCache namedCache : caches.values()) {
        sizeInBytes += namedCache.sizeInBytes();
    }
    return sizeInBytes;
}
The summation into sizeInBytes may overflow.
A check similar to the one in size() should be performed.
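One possible shape for such a check is a saturating sum. The sketch below is an assumption about the approach, not the code that was merged: `SaturatingSum` and `sumSizes` are hypothetical names, the per-cache sizes are passed in as a plain list, and all sizes are assumed to be non-negative so that a wrapped total always shows up as a negative number.

```java
import java.util.Arrays;
import java.util.List;

final class SaturatingSum {
    // Sums non-negative sizes; returns Long.MAX_VALUE instead of a wrapped value.
    static long sumSizes(List<Long> sizes) {
        long total = 0;
        for (long size : sizes) {
            total += size;
            if (total < 0) { // adding two non-negative longs wrapped around
                return Long.MAX_VALUE;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumSizes(Arrays.asList(1L, 2L, 3L)));
        System.out.println(sumSizes(Arrays.asList(Long.MAX_VALUE, 1L)));
    }
}
```

`Math.addExact` would be an alternative when an exception is preferable to a clamped result.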
Author: siva santhalingam <siva.santhalingam@gmail.com>
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>
Closes #4041 from shivsantham/kafka-6023
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Tom Bentley <tbentley@redhat.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes #3874 from lindong28/KAFKA-5163
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #3814 from omkreddy/KAFKA-4504
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
Closes #4074 from dguy/minor-session-window-equals
1. Added missing Javadocs in public interfaces.
2. Added missing upgrade web docs.
3. Minor improvements on exception messages.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Antony Stubbs <antony.stubbs@gmail.com>
Closes #4071 from guozhangwang/KMinor-javadoc-gaps
Author: Jacek Laskowski <jacek@japila.pl>
Reviewers: Apurva Mehta <apurva@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes #4038 from jaceklaskowski/KAFKA-4818-isolationLevel
- Remove "list commits" since we never use it
- Fix release branch detection to just look
for branches that start with digits
- Make script executable
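The release branch rule from the list above (a branch counts as a release branch if its name starts with a digit) can be sketched as follows. The merge script itself is not shown here, so this is an illustration only: `ReleaseBranches`, `isReleaseBranch`, and `filter` are hypothetical names, and the branch names in `main` are examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ReleaseBranches {
    // A branch is treated as a release branch when its name starts with an ASCII digit.
    static boolean isReleaseBranch(String name) {
        return !name.isEmpty() && name.charAt(0) >= '0' && name.charAt(0) <= '9';
    }

    // Keeps only release branches, preserving the input order.
    static List<String> filter(List<String> branches) {
        List<String> result = new ArrayList<>();
        for (String branch : branches) {
            if (isReleaseBranch(branch)) {
                result.add(branch);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(filter(Arrays.asList("trunk", "0.11.0", "1.0", "feature-x")));
    }
}
```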
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #4067 from ijuma/merge-script-improvements
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #4064 from mjsax/minor-add-state-serdes-test
Multiple inflights means that when there are rolling bounces or other cluster instability, there is an increased likelihood of a previously tried batch expiring in the accumulator. This is a fatal error
for a transactional producer, causing the `TransactionalMessageCopier` to exit. To work around this, we bump the request timeout. We can get rid of this when KIP-91 is merged.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #4039 from apurvam/MINOR-bump-request-timeout-in-transactional-message-copier
Author: Bill Bejeck <bill@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
Closes #4068 from bbejeck/MINOR_fix_java_doc_example_for_1_0_API
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian.guy@gmail.com>
Closes #4063 from mjsax/minor-improve-store-parameter-checks
The following happens on Windows for `HUP`:
[2017-10-11 21:45:11,642] FATAL (kafka.Kafka$)
java.lang.IllegalArgumentException: Unknown signal: HUP
    at sun.misc.Signal.<init>(Unknown Source)
    at kafka.Kafka$.registerHandler$1(Kafka.scala:67)
    at kafka.Kafka$.registerLoggingSignalHandler(Kafka.scala:73)
    at kafka.Kafka$.main(Kafka.scala:82)
    at kafka.Kafka.main(Kafka.scala)
I thought it was safer not to register them at all since the additional
logging is a nice-to-have and we haven't tested it on Windows.
Also changed map to be concurrent and removed stray
printStackTrace in test.
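The guard described above might look roughly like the sketch below. It is not the code in Kafka.scala: `SignalHandlerGuard` and its methods are hypothetical names, the signal list other than `HUP` is an assumption, and handlers are only recorded in a map rather than installed through `sun.misc.Signal`, so the sketch runs on any platform.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SignalHandlerGuard {
    // Concurrent map, mirroring the change to a concurrent map mentioned above.
    private final Map<String, Runnable> handlers = new ConcurrentHashMap<>();

    static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    // Registers logging handlers unless running on Windows.
    // Returns the number of handlers registered by this call.
    int registerLoggingHandlers(String osName, Runnable handler) {
        if (isWindows(osName)) {
            return 0; // skip entirely: some signals (e.g. HUP) are unknown on Windows
        }
        String[] signals = {"TERM", "INT", "HUP"}; // assumed list; only HUP appears in the trace
        for (String signal : signals) {
            handlers.put(signal, handler);
        }
        return signals.length;
    }

    public static void main(String[] args) {
        SignalHandlerGuard guard = new SignalHandlerGuard();
        int count = guard.registerLoggingHandlers(
            System.getProperty("os.name"), () -> System.out.println("signal received"));
        System.out.println("registered " + count + " handlers");
    }
}
```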
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Damian Guy <damian.guy@gmail.com>
Closes #4066 from ijuma/dont-register-signal-handler-windows
I found this by running the tests while I happened to
have a kafka broker running.
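A common way to avoid clashing with a locally running broker is to let the operating system pick a free port. The sketch below shows that technique in general; it is not taken from the patch, and `FreePort` and `pick` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

final class FreePort {
    // Binds to port 0 so the OS assigns an unused ephemeral port, then releases it.
    static int pick() {
        try (ServerSocket socket = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("free port: " + pick());
    }
}
```

The port is released before the caller uses it, so another process could still claim it in between; where possible, it is more robust to have the server under test bind to port 0 itself and then query the port it was given.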
Author: Tom Bentley <tbentley@redhat.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #4065 from tombentley/MINOR-random-port
Use Scala string templates instead of format
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #4058 from mimaison/minor_AFT_logging
guozhangwang Please review.
Author: Manjula K <manjula@kafka-summit.org>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #4059 from manjuapu/redesign-streams-page
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #4054 from guozhangwang/KMinor-pre-1.0-release
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #4046 from mjsax/kafka-5541-minor-follow-up
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #4051 from mjsax/minor-kip-182-follow-up
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #4035 from hachikuji/KAFKA-5547-followup and squashes the following commits:
f6b04ce1a [Jason Gustafson] Add a couple missed common fields
d3473b14d [Jason Gustafson] Fix compilation errors and a few warnings
58a0ae695 [Jason Gustafson] MINOR: Avoid some unnecessary collection copies in KafkaApis
With these changes, we are ensuring that the partitions being reassigned are from non-zero offsets. We also ensure that every message in the log has producerId and sequence number.
This means that it successfully reproduces https://issues.apache.org/jira/browse/KAFKA-6003.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #4029 from apurvam/KAFKA-6016-add-idempotent-producer-to-reassign-partitions
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #4048 from mjsax/fix-eos-test-race-condition
Author: bartdevylder <bartdevylder@gmail.com>
Author: Bart De Vylder <bartdevylder@gmail.com>
Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes #4044 from bartdevylder/KAFKA-6026
Author: Xin Li <Xin.Li@trivago.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes #4043 from lisa2lisa/fix
(cherry picked from commit bb27215cea)
Signed-off-by: Jason Gustafson <jason@confluent.io>
Author: Jason Gustafson <jason@confluent.io>
Reviewers: tedyu <yuzhihong@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #4047 from hachikuji/factor-out-some-common-fields
…up rebalances in progress
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #3506 from cmccabe/KAFKA-5565
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #4037 from mjsax/kafka-5541-dont-rethrow-on-suspend-or-close-2