Tested with DeleteRecordsRequestTest by Tom Bentley, which is part of
a separate pull request.
Author: tedyu <yuzhihong@gmail.com>
Reviewers: Tom Bentley <tbentley@redhat.com>, Ismael Juma <ismael@juma.me.uk>
Closes#4052 from tedyu/trunk
With the new consumer the "/consumers" path on Zookeeper isn't filled by consumer info. On closing the new consumer, there is some code that is useless to execute for trying to connect to Zookeeper (but the URL is null).
Author: ppatierno <ppatierno@live.com>
Author: Paolo Patierno <ppatierno@live.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3301 from ppatierno/old-consumer-delete-groupid
This is the PR related to the [KIP-204](https://cwiki.apache.org/confluence/display/KAFKA/KIP-204+%3A+Adding+records+deletion+operation+to+the+new+Admin+Client+API) in order to add the `deleteRecords` operation to the new Admin Client (it's already available in the "legacy" one).
Other than that, unit test and integration tests are added as well (such integration tests come from the "legacy" integration tests in order to test the new addition in the same way as the "legacy" one).
Author: Paolo Patierno <ppatierno@live.com>
Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#4132 from ppatierno/kafka-5925
rather than using GetDataRequest
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#4217 from mimaison/pathExists
Added unit test for ApiVersion and testApiVersions from
Scala to Java.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#4220 from ijuma/kafka-6210-iae-if-1.0.0-inter-broker-protocol-version
When reviewing https://github.com/apache/kafka/pull/4132, I felt that NOT_ENOUGH_REPLICAS should never happen actually. Hence proposing to remove it from the listed error code as well in the broker-side capture clause.
Testing added in 4132 should have been sufficient.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Dong Lin <lindong28@gmail.com>
Closes#4208 from guozhangwang/KMinor-delete-records-error-code
Also use ZkVersion.NoVersion instead of -1.
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#4196 from mimaison/zkclient_refactor
Also rename UtilsTest to CoreUtilsTest and note
that `getOrElseUpdate` has the right behaviour
in Scala 2.12.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#4142 from ijuma/use-scala-futures-in-core-utils-test
This patch also adds the a test for test the log deletion after close.
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Jiangjie Qin <becket.qin@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#4179 from lindong28/KAFKA-6175
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#4198 from onurkaraman/make-controller-helper-methods-private
This memory leak could eventually lead to an OutOfMemoryError. This
was particularly likely in case of down conversions as the leaked
channels would hold on to the record batch (which is only loaded
into the heap in case of down conversions).
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#4193 from rajinisivaram/KAFKA-6185-oom
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#4181 from rajinisivaram/KAFKA-6164
We currently enqueue a PreferredReplicaLeaderElection controller event in PreferredReplicaElectionHandler's handleCreation, handleDeletion, and handleDataChange. We can just enqueue the event upon znode creation and after preferred replica leader election completes. The processing of this latter enqueue will register the exist watch on PreferredReplicaElectionZNode and perform any pending preferred replica leader election that may have occurred between completion and registration.
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#4189 from onurkaraman/KAFKA-6146
- Eliminated all the unnecessary allocations of `TopicPartition` and
`TopicAndPartition` in the Controller. We now use the former
in the Controller (bringing it inline with the rest of the non legacy
code).
- Fixed missed `Listener` -> `Handler` renames for companion
objects.
- More String.format -> String interpolation conversions (the former
is roughly 5 times more expensive).
- Some other minor clean-ups.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Onur Karaman <okaraman@linkedin.com>, Viktor Somogyi <viktorsomogyi@gmail.com>
Closes#4152 from ijuma/controller-topic-partition-and-other-clean-ups
ZooKeeperClient is a zookeeper client that encourages pipelined requests to zookeeper. We want to add the notion of max inflight requests to the client for several reasons:
1. to bound memory overhead associated with async requests on the client.
2. to not overwhelm the zookeeper ensemble with a burst of requests.
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ted Yu <yuzhihong@gmail.com>, Jun Rao <junrao@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#3860 from onurkaraman/KAFKA-5894
We only need to generate the event when the znode is created or deleted.
In the former case, we start the reassignment while in the latter we
re-register the watcher (necessary for the Controller to detect future
reassignments).
During Controller failover, we restart the reassignment without generating
an event so it's not affected by this change.
Also use the Controller cache (`ControllerContext.partitionsBeingReassigned`)
in `removePartitionFromReassignedPartitions` instead of reloading the
data from ZooKeeper.
Overall, we would previously load the reassignment data from ZooKeeper twice
per completed partition whereas now as don't do it at all. As an example,
say there were 30k partitions being reassigned, these changes save the
allocation of 900 million `TopicAndPartition` and `Seq[Int]` (replicas)
instances (could easily amount to 20-40 GB depending on the topic name
length). This matters most in cases where the partitions being reassigned
don't have much data allowing the reassignment to complete reasonably
fast for many of the partitions.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Onur Karaman <okaraman@linkedin.com>
Closes#4143 from ijuma/partition-reassignment-ignore-handle-deletion-and-data-change
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jiangjie Qin <becket.qin@gmail.com>
Closes#4177 from lindong28/KAFKA-6172
Windows directory paths often contain colons which are not allowed in
yammer metrics. Metric tag values with special characters must be
quoted.
Author: huxihx <huxi_2b@hotmail.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#4173 from huxihx/KAFKA-6156
This is a followup to #4137
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes#4146 from apurvam/MINOR-followups-to-bump-epoch-on-expire-patch
Author: Richard Yu <richardyu@Richards-Air.attlocal.net>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#4110 from ConcurrencyPractitioner/trunk
I kept zkUtils for the call to AdminUtils.createTopic(). AdminUtils can be done in another PR.
Is there a reason why we use TopicAndPartition instead of TopicPartition in KafkaControllerZkUtils ?
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#4111 from mimaison/KAFKA-6073
Failure to close the producer could cause a transient failure, more details
below.
The request timeout was only 2 seconds, exceptions thrown were not
propagated and the producer would not be closed. If the exception
was thrown during `send`, we did not increment `numMessages`
allowing the test to pass.
I have increased the timeout to 10 seconds and made sure that
exceptions are propagated.
Example of the error:
```text
kafka.api.SaslSslAdminClientIntegrationTest > classMethod STARTED
kafka.api.SaslSslAdminClientIntegrationTest > classMethod FAILED
java.lang.AssertionError: Found unexpected threads, allThreads=Set(metrics-meter-tick-thread-2, Signal Dispatcher, main, Reference Handler, scala-execution-context-global-164, kafka-producer-network-thread | producer-1, scala-execution-context-global-166, Test worker, scala-execution-context-global-1249, /0:0:0:0:0:0:0:1:58910 to /0:0:0:0:0:0:0:1:43025 workers Thread 2, Finalizer, /0:0:0:0:0:0:0:1:58910 to /0:0:0:0:0:0:0:1:43025 workers Thread 3, scala-execution-context-global-163, metrics-meter-tick-thread-1)
```
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#4144 from ijuma/ensure-producer-is-closed-test-alter-replica-log-dirs
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: Tom Bentley <tbentley@redhat.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3951 from omkreddy/KIP-103-DOCS
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#4139 from vahidhashemian/minor/indentation_fix_1710
A description of the problem is in the JIRA. I have added an integration test which reproduces the original scenario, and also added unit test cases.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Ted Yu <yuzhihong@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#4137 from apurvam/KAFKA-6119-bump-epoch-when-expiring-transactions
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#4140 from rajinisivaram/KAFKA-6131-txn-concurrentmap
Even though this class is internal, it's widely
used by other projects and it's better to avoid
breaking them until we have a publicly supported
test library.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#4138 from ijuma/revert-embedded-zookeeper-rename
- kafka.controller.ZookeeperClient -> kafka.zookeeper.ZooKeeperClient
- kafka.controller.ControllerZkUtils -> kafka.zk.KafkaZkClient
- kafka.controller.ZkData -> kafka.zk.ZkData
- Renamed various fields to match new names and for consistency
- A few clean-ups in ZkData
- Document intent
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Onur Karaman <okaraman@linkedin.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#4112 from ijuma/rename-zookeeper-client-and-move-to-zookeper-package
* Fix issue in `retryRequestsUntilConnected` where the same response
could appear multiple times (implies that we are lacking test coverage)
* Introduce type member in AsyncRequest for the AsyncResponse
type and refactor the code to eliminate most downcasts
* Remove a number of unnecessary collection copies in
`retryRequestsUntilConnected`
* Move ControllerContext to its own file
* Rename getACL/setACL to getAcl/setAcl to match Kafka naming
convention
* Replace tuple of 3 elements with case class in one place (we
should do this in other places too)
* Extract `send` and `shouldWatch` from
`ZooKeeperClient.handleRequests`
* Use pattern matching instead of if/else chains in a few places (we
should do it in more places)
* A couple of renames to avoid overloads and hence benefit from
better type inference
* Use Option and default arguments instead of passing null in
some places
* `Expired` is no longer a case class since it has no parameters,
but it has state
* Various minor clean-ups
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Onur Karaman <okaraman@linkedin.com>
Closes#4088 from ijuma/async-zkclient-cleanups
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#4103 from rajinisivaram/KAFKA-6042-group-deadlock
(cherry picked from commit 5ee157126d)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
Rearranged the testAddPartitionDuringDeleteTopic() test to keep the
likelyhood of the race condition.
Author: Maytee Chinavanichkit <maytee.chinavanichkit@linecorp.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#4056 from mayt/KAFKA-6051
Kafka today uses ZkClient, a wrapper client around the raw Zookeeper client. This library only exposes synchronous apis to the user. Synchronous apis mean we must wait an entire round trip before doing the next operation.
This becomes problematic with partition-heavy clusters, as we find the controller spending a significant amount of time just sending many sequential reads and writes to zookeeper at the per-partition granularity. This especially becomes an issue during:
- controller failover, where the newly elected controller effectively reads all zookeeper state.
- broker failures and controlled shutdown. The controller tries to elect a new leader for partitions previously led by the broker. The controller also removes the broker from isr on partitions for which the broker was a follower. These all incur partition-granular reads and writes to zookeeper.
As a first step in addressing these issues, we built a low-level wrapper client called ZookeeperClient in KAFKA-5501 that encourages pipelined, asynchronous apis.
This patch converts the controller to use the async ZookeeperClient to improve controller failover, broker failure handling, and controlled shutdown times.
Some notable changes made in this patch:
- All ControllerEvents now defer access to zookeeper at processing time instead of enqueue time as was intended with the single-threaded event queue model patch from KAFKA-5028. This results in a fresh view of the zookeeper state by the time we process the event. This reverts the hacks from KAFKA-5502 and KAFKA-5879.
- We refactored PartitionStateMachine and ReplicaStateMachine to process multiple partitions and replicas in batch rather than one-at-a-time so that we can send a batch of requests over to ZookeeperClient to pipeline.
- We've decouple ZookeeperClient handler registration from watcher registration. Previously, these two were coupled, which meant handler registrations actually sent out a request to the zookeeper ensemble to do the actual watcher registration. In KafkaController.onControllerFailover, we register partition modification handlers (thereby registering watchers) and additionally lookup the partition assignments for every topic in the cluster. We can shave a bit of time off failover if we merge these two operations. We can do this by decoupling ZookeeperClient handler registration from watcher registration. This means ZookeeperClient's registration apis have been changed so that they are purely in-memory operations, and they only take effect when the client sends ExistsRequest, GetDataRequest, or GetChildrenRequest.
- We've simplified the logic for updating LeaderAndIsr such that if we get a BADVERSION error code, the controller will now just retry in the next round by reading the new state and trying the update again. This simplifies logic when updating the partition leader epoch, removing replicas from isr, and electing leaders for partitions.
- We've implemented KAFKA-5083: always leave the last surviving member of the ISR in ZK. This means that if people re-disabled unclean leader election, we can still try to elect the leader from the last in-sync replica.
- ZookeeperClient's handlers have been changed so that their methods default to no-ops for convenience.
- All znode paths and definitions for znode encoding and decoding have been consolidated as static methods in ZkData.scala.
- The partition leader election algorithms have been refactored as pure functions so that they can be easily unit tested.
- PartitionStateMachine and ReplicaStateMachine now have unit tests.
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#3765 from onurkaraman/KAFKA-5642
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Tom Bentley <tbentley@redhat.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#3874 from lindong28/KAFKA-5163
The following happens on Windows for `HUP`:
[2017-10-11 21:45:11,642] FATAL (kafka.Kafka$)
java.lang.IllegalArgumentException: Unknown signal: HUP
at sun.misc.Signal.<init>(Unknown Source)
at kafka.Kafka$.registerHandler$1(Kafka.scala:67)
at kafka.Kafka$.registerLoggingSignalHandler(Kafka.scala:73)
at kafka.Kafka$.main(Kafka.scala:82)
at kafka.Kafka.main(Kafka.scala)
I thought it was safer not to register them at all since the additional
logging is a nice to have and we haven't tested it on Windows.
Also changed map to be concurrent and removed stray
printStackTrace in test.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Damian Guy <damian.guy@gmail.com>
Closes#4066 from ijuma/dont-register-signal-handler-windows
Use Scala string templates instead of format
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#4058 from mimaison/minor_AFT_logging
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#4035 from hachikuji/KAFKA-5547-followup and squashes the following commits:
f6b04ce1a [Jason Gustafson] Add a couple missed common fields
d3473b14d [Jason Gustafson] Fix compilation errors and a few warnings
58a0ae695 [Jason Gustafson] MINOR: Avoid some unnecessary collection copies in KafkaApis
Author: Xin Li <Xin.Li@trivago.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes#4043 from lisa2lisa/fix
(cherry picked from commit bb27215cea)
Signed-off-by: Jason Gustafson <jason@confluent.io>
…up rebalances in progress
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3506 from cmccabe/KAFKA-5565
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#4023 from ijuma/kafka-5829-avoid-reading-older-segments-on-hard-shutdown
It is possible for batches with sequence numbers to be in the `deque` while at the same time the in flight batches in the `TransactionManager` are removed due to a producerId reset.
In this case, when the batches in the `deque` are drained, we will get a `NullPointerException` in the background thread due to this line:
```java
if (first.hasSequence() && first.baseSequence() != transactionManager.nextBatchBySequence(first.topicPartition).baseSequence())
```
Particularly, `transactionManager.nextBatchBySequence` will return null, because there no inflight batches being tracked.
In this patch, we simply allow the batches in the `deque` to be drained if there are no in flight batches being tracked in the TransactionManager. If they succeed, well and good. If the responses come back with an error, the batces will be ultimately failed in the producer with an `OutOfOrderSequenceException` when the response comes back.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#4022 from apurvam/KAFKA-6015-npe-in-record-accumulator