This makes the case where we build the records from scratch consistent
with the case where update the batch header "in place". Thanks to
edenhill who found the issue while testing librdkafka.
The reason our tests don’t catch this is that we rely on the maxTimestamp
to compute the record level timestamps if log append time is used.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#3177 from ijuma/set-base-sequence-for-log-append-time
Without it, it's possible that the assertion is checked before the exception
is thrown in the callback.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3182 from ijuma/fix-controller-failover-flakiness
Also introduce TopicConfig.
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3120 from cmccabe/KAFKA-5265
This should be less flaky as it has a higher timeout. I also increased the timeout
in a couple of other tests that had a very low (100 ms) timeouts.
The failure would manifest itself as:
```text
java.net.SocketTimeoutException
at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:85)
at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:100)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:84)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:133)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:132)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:131)
at kafka.api.test.ProducerCompressionTest.testCompression(ProducerCompressionTest.scala:97)
```
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3178 from ijuma/producer-compression-test-flaky
It sometimes fails in Jenkins like:
```text
java.lang.AssertionError: IllegalStateException was not thrown
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at kafka.controller.ControllerFailoverTest.testHandleIllegalStateException(ControllerFailoverTest.scala:86)
```
I ran it locally 100 times with no failure.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3176 from ijuma/improve-controller-failover-assert
Return UNSUPPORTED_MESSAGE_FORMAT in handleWriteTxnMarkers when a topic is not the correct message format.
Remove any TopicPartitions that have same error from those waiting for markers
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3152 from dguy/kafka-5308
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3161 from hachikuji/KAFKA-5251
Also update the test to be simpler since we can use a mock event to simulate the issue
more easily (thanks Jun for the suggestion). This should fix two issues:
1. A transient test failure due to a NPE in ControllerFailoverTest.testMetadataUpdate:
```text
Caused by: java.lang.NullPointerException
at kafka.controller.ControllerBrokerRequestBatch.addUpdateMetadataRequestForBrokers(ControllerChannelManager.scala:338)
at kafka.controller.KafkaController.sendUpdateMetadataRequest(KafkaController.scala:975)
at kafka.controller.ControllerFailoverTest.testMetadataUpdate(ControllerFailoverTest.scala:141)
```
The test was creating an additional thread and it does not seem like it was doing the
appropriate synchronization (perhaps this became more of an issue after we changed
the Controller to be single-threaded and changed the locking)
2. Setting `activeControllerId.set(-1)` in `triggerControllerMove` causes `Reelect` not to invoke `onControllerResignation`. Among other things, this causes an `IllegalStateException` to be thrown when `KafkaScheduler.startup` is invoked for the second time without the corresponding `shutdown`. We now simply call `onControllerResignation` as part of `triggerControllerMove`.
Finally, I included a few clean-ups:
1. No longer update the broker state in `onControllerFailover`. This is no longer needed
since we removed the `RunningAsController` state (KAFKA-3761).
2. Trivial clean-ups in KafkaController
3. Removed unused parameter in `ZkUtils.getPartitionLeaderAndIsrForTopics`
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#2935 from ijuma/on-controller-resignation-if-trigger-controller-move
Here is the sketch of this proposal:
1. When it is time to send the txn markers, only look for the leader node of the partition once instead of retrying, and if that information is not available, it means the partition is highly likely been removed since it was in the cache before. In this case, we just remove the partition from the metadata object and skip putting into the corresponding queue, and if all partitions' leader broker are non-available, complete this delayed operation to proceed to write the complete txn log entry.
2. If the leader id is unknown from the cache but the corresponding node object with the listener name is not available, it means that the leader is likely unavailable right now. Put it into a separate queue and let sender thread retry fetching its metadata again each time upon draining the queue.
One caveat of this approach is the delete-and-recreate case, and the argument is that since all the messages are deleted anyways when deleting the topic-partition, it does not matter whether the markers are on the log partitions or not.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Apurva Mehta <apurva@confluent.io>, Damian Guy <damian.guy@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3130 from guozhangwang/K5202-handle-topic-deletion
KAFKA-4603 the command parsed error
Using "new OptionParser" might result in parse error
Change all the OptionParser constructor in Kafka into "new OptionParser(false)"
Author: xinlihua <xin.lihua1@zte.com.cn>
Author: unknown <00067310@A23338408.zte.intra>
Author: auroraxlh <xin.lihua1@zte.com.cn>
Author: xin <xin.lihua1@zte.com.cn>
Reviewers: Damian Guy, Guozhang Wang
Closes#2349 from auroraxlh/fix_OptionParser_bug
In the original implementation, console-consumer fails to honor `--value-deserializer` config.
Author: amethystic <huxi_2b@hotmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#3100 from amethystic/KAFKA-5278
Keep track of when a transaction has begun by setting a flag, `transactionStarted` when a successfull `AddPartitionsToTxnResponse` or `AddOffsetsToTxnResponse` had been received. If an `AbortTxnRequest` about to be sent and `transactionStarted` is false, don't send the request and transition the state to `READY`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Apurva Mehta <apurva@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3126 from dguy/kafka-5260
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#3142 from hachikuji/KAFKA-5316
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3147 from vahidhashemian/minor/remove_unsed_method_parameter_simpleaclauthorizer
Add check in `KafkaApis` that the inter broker protocol version is at least `KAFKA_0_11_0_IV0`, i.e., supporting transactions
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes#3103 from dguy/kafka-5128
The previous code did not handle this correctly if a batch was
compacted more than once.
Also add test case for duplicate check after log cleaning and
improve various comments.
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3145 from hachikuji/minor-improve-base-sequence-docs
remove transactions that have not been updated for at least `transactional.id.expiration.ms`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Apurva Mehta, Guozhang Wang
Closes#3101 from dguy/kafka-5279
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#3123 from hachikuji/KAFKA-4935
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#3133 from hachikuji/minor-replica-manager-append-refactor
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Jun Rao <junrao@gmail.com>
Closes#3075 from hachikuji/KAFKA-5259-FIXED
Clarify the consumer group command help message around `zookeeper`, `bootstrap-server`, and `new-consumer` options.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#2046 from vahidhashemian/minor/improve_consumer_group_command_doc
Before this patch the consumer would return the cached offsets for partitions in its current assignment. This worked when all the offset commits went through the consumer.
With KIP-98, offsets can be committed transactionally through the producer. This means that relying on cached positions in the consumer returns incorrect information: since commits go through the producer, the cache is never updated.
Hence we need to update the `KafkaConsumer.committed` method to always lookup the server for the last committed offset to ensure it gets the correct information every time.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson, Guozhang Wang
Closes#3119 from apurvam/KAFKA-5273-kafkaconsumer-committed-should-always-hit-server
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Onur Karaman <okaraman@linkedin.com>
Closes#2983 from ijuma/kafka-5135-controller-health-metrics-kip-143
This ticket is all about ControllerContext initialization and teardown. The key points are:
1. we should teardown ControllerContext during resignation instead of waiting on election to fix it up. A heapdump shows that the former controller keeps pretty much all of its ControllerContext state laying around.
2. we don't properly teardown/reset ControllerContext.partitionsBeingReassigned. This can cause problems when the former controller becomes re-elected as controller at a later point in time.
Suppose a partition assignment is initially R0. Now suppose a reassignment R1 gets stuck during controller C0 and an admin tries to "undo" R1 (by deleting /admin/partitions_reassigned, deleting /controller, and submitting another reassignment specifying R0). The new controller C1 may succeed with R0. If the controller moves back to C0, it will then reattempt R1 even though that partition reassignment has been cleared from zookeeper prior to shifting the controller back to C0. This results in the actual partition reassignment in zookeeper being unexpectedly changed back to R1.
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#3122 from onurkaraman/KAFKA-5310
Two major changes plus one minor change:
0. change stateLock to a read-write lock.
1. Put the check of "isCoordinator" and "coordinatorLoading" together with the return of the metadata, under one read lock block, since otherwise we can get incorrect behavior if there is a change in the metadata cache after the check but before the accessing of the metadata.
2. Grab the read lock right before trying to append to local txn log, and until the local append returns; this is to avoid the scenario that the epoch has actually changed when we are appending to local log (e.g. emigration followed by immigration).
3. only watch on txnId instead of txnId and txnPartitionId in the txn marker purgatory, and disable reaper thread, as we can now safely clear all the delayed operations by traversing the marker queues.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Jason Gustafson, Jun Rao
Closes#3082 from guozhangwang/K5231-read-write-lock
`TransactionCoordinatorIntegrationTest` is not covering anything that isn't already covered by the more complete `TransactionsTest`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#3128 from dguy/minor-remove-test
With this patch, offset commits are always materialized according to the order of the commit records in the offsets topic.
Before this patch, transactional offset commits were materialized in transaction order. However, the log cleaner will always preserve the record with the greatest offset. This meant that if there was a mix of offset commits from a consumer and a transactional producer, then it we would switch from transactional order to offset order after cleaning, resulting in an inconsistent state.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3108 from apurvam/KAFKA-5247-materialize-committed-offsets-in-offset-order
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#3118 from hachikuji/disallow-transactional-idempotent-downconversion
`shutdownIdleFetcherThreads()` can throw InterruptedException
for example.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3096 from ijuma/kafka-5289-stop-replica-should-not-send-two-responses
We should retry AddPartitionsToTxnRequest and TxnOffsetCommitRequest when receiving an UNKNOWN_TOPIC_OR_PARTITION error.
As described in the JIRA: It turns out that the `UNKNOWN_TOPIC_OR_PARTITION` is returned from the request handler in KafkaAPis for the AddPartitionsToTxn and the TxnOffsetCommitRequest when the broker's metadata doesn't contain one or more partitions in the request. This can happen for instance when the broker is bounced and has not received the cluster metadata yet.
We should retry in these cases, as this is the model followed by the consumer when committing offsets, and by the producer with a ProduceRequest.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3094 from apurvam/KAFKA-5269-handle-unknown-topic-partition-in-transaction-manager
Summary:
- add `reconnect.backoff.max.ms` common client configuration parameter
- if `reconnect.backoff.max.ms` > `reconnect.backoff.ms`, apply an exponential backoff policy
- apply +/- 20% random jitter to smooth cluster reconnects
Author: Dana Powers <dana.powers@gmail.com>
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Roger Hoover <roger.hoover@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1523 from dpkp/exp_backoff
- Avoid unnecessary inner methods
- Remove redundant parameter in `sendResponseExemptThrottle`
- Go through `sendResponseExemptThrottle` for produce requests with acks=0
- Tighten how we handle cases where there’s no response
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3087 from ijuma/kafka-apis-improvements
Perform down conversion after throttling to avoid retaining
messages in memory during throttling since this could result
in OOM. Also update bytesOut metrics after throttling.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3068 from rajinisivaram/KAFKA-5250
Use IBM Kerberos module for SASL tests if running on IBM JDK
Developed with edoardocomar
Based on https://github.com/apache/kafka/pull/738 by rajinisivaram
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>, Edoardo Comar <ecomar@uk.ibm.com>
Closes#2878 from mimaison/KAFKA-3070
This is a rebase version of [PR#2973](https://github.com/apache/kafka/pull/2973).
guozhangwang , please review this updated PR.
Author: umesh chaudhary <umesh9794@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3086 from umesh9794/mylocal
Today PartitionStateMachine and ReplicaStateMachine define and assert the
valid state transitions inline for each state. It's cleaner to move the
transition rules into ReplicaState/PartitionState and do the assertion at
the top of the handleStateChange.
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3071 from onurkaraman/KAFKA-5258
mjsax dguy guozhangwang Could you please review the changes.
Author: Bharat Viswanadham <bharatv@us.ibm.com>
Reviewers: Matthias J. Sax, Damian Guy, Guozhang Wang
Closes#3073 from bharatviswa504/KAFKA-5210