It’s a simple matter of creating the internal topic before trying to send
to it. Otherwise, we could get an `UnknownTopicOrPartitionException`
in some cases.
Without the change, I could reproduce a failure in less than
5 runs. With the change, 30 consecutive runs succeeded.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Apurva Mehta <apurva.1618@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#2584 from ijuma/test-cannot-send-to-internal-topic-transient-failure
It contained this step:
val canShutdown = isShuttingDown.compareAndSet(false, true)
if (canShutdown && shutdownLatch.getCount > 0) {
without any fallback for the case of `shutdownLatch.getCount == 0`. So in the case
of `shutdownLatch.getCount == 0` (when a previous call to the shutdown method
was right about to finish) you would set `isShuttingDown` to true again without any
possibility of ever getting the server started (since `startup` will check
`isShuttingDown` before setting up a new latch with count 1).
Long story short: concurrent calls to shutdown can get the server locked in a broken state.
This fixes the reported error:
java.lang.IllegalStateException: Kafka server is still shutting down, cannot re-start!
at kafka.server.KafkaServer.startup(KafkaServer.scala:184)
at kafka.integration.KafkaServerTestHarness$$anonfun$restartDeadBrokers$2.apply$mcVI$sp(KafkaServerTestHarness.scala:117)
at kafka.integration.KafkaServerTestHarness$$anonfun$restartDeadBrokers$2.apply(KafkaServerTestHarness.scala:116)
at kafka.integration.KafkaServerTestHarness$$anonfun$restartDeadBrokers$2.apply(KafkaServerTestHarness.scala:116)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.immutable.Range.foreach(Range.scala:160)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at kafka.integration.KafkaServerTestHarness$class.restartDeadBrokers(KafkaServerTestHarness.scala:116)
at kafka.api.ConsumerBounceTest.restartDeadBrokers(ConsumerBounceTest.scala:34)
at kafka.api.ConsumerBounceTest$BounceBrokerScheduler.doWork(ConsumerBounceTest.scala:158)
Author: Armin Braun <me@obrown.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#2568 from original-brownbear/KAFKA-4198
The intent is good, but it needs to take into account broker configs as well.
See KAFKA-4788 for more details.
This reverts commit 4ca5abe8ee7578f602fb7653cb8a09640607ea85.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#2588 from ijuma/kafka-4788
Also move `requireTimestamp` to `minVersion` logic from `Fetcher` to
`ListOffsetRequest.Builder.forConsumer()`.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#2580 from ijuma/move-proto-utils-to-api-keys
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Dong Lin <lindong28@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2489 from cmccabe/KAFKA-4708
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#2543 from hachikuji/remove-message-writer
Fixed ClassCastException resulting from missing type hint in request logging.
Author: Armin Braun <me@obrown.io>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#2571 from original-brownbear/fix-logging-err-response
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Ewen Cheslack-Postava <me@ewencp.org>
Closes#2566 from hachikuji/hotfix-request-logging
More details:
* Replaced `struct` field in Request/Response with a `toStruct` method. This
makes the performance model (including memory usage) easier to understand.
Note that requests have `toStruct()` while responses have `toStruct(version)`.
* Replaced mutable `version` field in `Request.Builder` with an immutable
field `desiredVersion` and a `version` parameter passed to the `build` method.
* Optimised `handleFetchRequest` to avoid unnecessary creation of `Struct`
instances (from 4 to 2 in the worst case and 2 to 1 in the best case).
* Various clean-ups in request/response classes and their test. In particular,
it is now clear what we are testing. Previously, it looked like we were testing
more than we really were.
With this in place, we could remove `AbstractRequest.Builder` in the future by
doing the following:
* Change `AbstractRequest.toStruct` to accept a version (like responses).
* Change `AbstractRequest.version` to be `desiredVersion` (like `Builder`).
* Change `ClientRequest` to take `AbstractRequest`.
* Move validation from the `build` methods to the request constructors or
static factory methods.
* Anything else required for the code to compile again.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Apurva Mehta <apurva.1618@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#2513 from ijuma/separate-struct
Author: Jiangjie Qin <becket.qin@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2544 from becketqin/KAFKA-4340_follow_up
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva.1618@gmail.com>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes#2545 from hachikuji/KAFKA-4761
…porter.configure
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#2540 from cmccabe/KAFKA-4756
`ConsumerGroup` should be `Group`.
Author: Dustin Cote <dustin@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#2534 from cotedm/acl-znode-correction
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes#2475 from vahidhashemian/minor/use_explicit_Errors_type_when_possible
Added general explanation of the tool and what it does. Also added few details to the arguments.
Author: Gwen Shapira <cshapi@gmail.com>
Reviewers: Matthias J. Sax, Michael G. Noll, Guozhang Wang
Closes#2503 from gwenshap/KAFKA-4733
Author: Jiangjie Qin <becket.qin@gmail.com>
Reviewers: Dong Lin <lindong28@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#2501 from becketqin/KAFKA-4734
In https://issues.apache.org/jira/browse/KAFKA-4521 we fixed a potential message reorder bug in MM. However, the patch introduced another bug that can cause deadlock during MM shutdown. The deadlock will happen if zookeeper listener thread call requestAndWaitForCommit() after MirrorMaker thread has already exited loop of consuming and producing messages.
This patch fixes the problem by setting `iter` to `null` in `MirrorMakerOldConsumer.cleanup()`. If zookeeper listener thread calls `requestAndWaitForCommit()` after `cleanup()`, then it will not block waiting for commit notification since `iter == null`. If zookeeper listener thread calls `requestAndWaitForCommit()` before `cleanup()`, then `cleanup()` will call `notifyAll()` to unblock zookeeper listener thread.
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Jiangjie Qin <becket.qin@gmail.com>
Closes#2504 from lindong28/KAFKA-4735
OfflinePartitionsCount PreferredReplicaImbalanceCount metrics check for
topic being deleted
Added integration test which polls the metrics while topics are being
created and deleted
Developed with mimaison
Author: Edoardo Comar <ecomar@uk.ibm.com>
Reviewers: Dong Lin <lindong28@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#2325 from edoardocomar/KAFKA-4441
With this change, the consumer will be considered initialized in the
ProduceConsumeValidate tests once its partitions have been assigned.
Author: Apurva Mehta <apurva.1618@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#2347 from apurvam/KAFKA-4588-fix-race-between-producer-consumer-start
Author: Maysam Yabandeh <myabandeh@dropbox.com>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#2474 from ijuma/kafka-4039-deadlock-during-shutdown
Kafka brokers have a config called "offsets.topic.replication.factor" that specify the replication factor for the "__consumer_offsets" topic. The problem is that this config isn't being enforced. If an attempt to create the internal topic is made when there are fewer brokers than "offsets.topic.replication.factor", the topic ends up getting created anyway with the current number of live brokers. The current behavior is pretty surprising when you have clients or tooling running as the cluster is getting setup. Even if your cluster ends up being huge, you'll find out much later that __consumer_offsets was setup with no replication.
The cluster not meeting the "offsets.topic.replication.factor" requirement on the internal topic is another way of saying the cluster isn't fully setup yet.
The right behavior should be for "offsets.topic.replication.factor" to be enforced. Topic creation of the internal topic should fail with GROUP_COORDINATOR_NOT_AVAILABLE until the "offsets.topic.replication.factor" requirement is met. This closely resembles the behavior of regular topic creation when the requested replication factor exceeds the current size of the cluster, as the request fails with error INVALID_REPLICATION_FACTOR.
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#2177 from onurkaraman/KAFKA-3959
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2455 from hachikuji/KAFKA-4704
In some Places for the loop was used but it can be replaced by the for each.
In One file if else if else was used so I replaced the same with match.
Author: Akash Sethi <akash.sethi@knoldus.in>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2435 from akashsethi24/trunk
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#2443 from ijuma/close-create-topics-policy-during-shutdown
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#2406 from ijuma/kafka-4636-per-listener-security-settings
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2436 from hachikuji/hotfix-offset-deletion
This behaviour was changed in 8b3c6c0, but it caused interceptor
test failures (which rely on callbacks) and since we’re so close to
code freeze, it’s better to be conservative.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#2440 from ijuma/kafka-4699-callbacks-invoked-before-future-is-completed
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#2390 from cmccabe/KAFKA-4630
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Jiangjie Qin <becket.qin@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2416 from hachikuji/refactor-partition-lag-cleanup
There is a slight change of behaviour: we now complete the `Future` returned from `send`
before the callbacks are invoked. This seems OK and perhaps a little better as the `Future`
can make progress sooner (as it would typically be blocked on a different thread than the
I/O thread that invokes the callbacks).
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#2318 from ijuma/kafka-4597-record-metadata-log-append-time
Author: Manikumar Reddy O <manikumar.reddy@gmail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Sriharsha Chintalapani <harsha@hortonworks.com>
Closes#1850 from omkreddy/KAFKA-2700-DELETE
Fixes a logic error in the Reassignment process which throws an exception
if you don't rebalance all partitions.
Author: Ben Stopford <benstopford@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#2399 from benstopford/KAFKA-4596
We now throw the correct TopicExistsException instead.
Author: Andrew Olson <aolson1@cerner.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#2425 from noslowerdna/KAFKA-4687
Author: Magnus Reftel <magnus.reftel@skatteetaten.no>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2151 from reftel/feature/authorizer_name_reference
This is a KIP-104/105 follow-up. Thanks to ijuma for pointing out.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#2350 from enothereska/minor-broker-level-config
Issue: https://issues.apache.org/jira/browse/KAFKA-4614
Fixes the problem that the broker threads suffered by long GC pause.
When GC thread collects mmap objects which were created for index files, it unmaps memory mapping so kernel turns to delete a file physically. This work may transparently read file's metadata from physical disk if it's not available on cache.
This seems to happen typically when we're using G1GC, due to it's strategy to left a garbage for a long time if other objects in the same region are still alive.
See the link for the details.
Author: Yuto Kawamura <kawamuray.dadada@gmail.com>
Reviewers: Apurva Mehta <apurva.1618@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>,
Closes#2352 from kawamuray/KAFKA-4614-force-munmap-for-index
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Apurva Mehta <apurva.1618@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2184 from cmccabe/KAFKA-4457
Remove workaround for testing multiple SASL mechanisms using
sasl.jaas.config and the new support for multiple client
modules within a JVM.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2373 from rajinisivaram/KAFKA-4568
1. Added javadoc to public classes
2. Removed `s` from config name for consistency with interface name
3. The policy interface now implements Configurable and AutoCloseable as per the KIP
4. Use `null` instead of `-1` in `RequestMetadata`
5. Perform all broker validation before invoking the policy
6. Add tests
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#2388 from ijuma/create-topic-policy-docs-and-config-name-change
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2383 from hachikuji/minor-cleanup-kip-88
Fixes KAFKA-3857
Changes proposed in this pull request:
An additional log cleaner metric has been added:
time-since-last-run-ms: Time since the last log cleaner run, in milliseconds. This metric would be reset to 0 every time log cleaner thread runs. If this metric keeps constantly increasing, it indicates that the log cleaner thread is not alive.
If you are creating alerts around log cleaner, you could monitor this metric. A high "time-since-last-run-ms" value (eg: 600000) indicates that the log cleaner hasn't been running since the last 10 minutes.
The code has been tested. JMX metric has been verified.
Note: This pull request is a continuation of the following pull request. PR#1593 was quite old and I had some trouble rebasing it. Decided to start a fresh PR.
927b28cf41 (diff-ca1c127eee4b3c748ae73028f6abeab8)
Author: Kiran Pillarisetty <pillarisetty@tivo.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#2378 from kiranptivo/log_cleaner_jmx_metric
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#2074 from vahidhashemian/KAFKA-3853