In the `AddPartitionsToTxn` request handling, if even one partition fails authorization checks, the entire request is essentially failed. However, the `AddPartitionsToTxnResponse` today will only contain the error codes for the topics which failed authorization. It will have no error code for the topics which succeeded, making it inconsistent with other APIs.
This patch adds a new error code `OPERATION_NOT_ATTEMPTED` which is returned for the successful partitions to indicate that they were not added to the transaction.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes#3204 from apurvam/KAFKA-5322-add-operation-not-attempted-for-add-partitions
We had originally increased Snappy’s block size as part of KAFKA-3704. However,
we had some issues with excessive memory usage in the producer and we reverted
it in 7c6ee8d5e.
After more investigation, we fixed the underlying reason why memory usage seemed
to grow much more than expected via KAFKA-3747 (included in 0.10.0.1).
In 0.10.2, we changed the broker to use the same classes as the producer and the
broker’s block size for Snappy was changed from 32 KB to 1KB. As reported in
KAFKA-5236, the on disk size is, in some cases, 50% larger when the data is compressed
with 1 KB instead of 32 KB as the block size.
As discussed in KAFKA-3704, it may be worth making this configurable and/or allocate
the compression buffers from the producer pool. However, for 0.11.0.0, I think the
simplest thing to do is to default to 32 KB for Snappy (the default if no block size
is provided).
I also increased the Gzip buffer size. 1 KB is too small and the default is smaller
still (512 bytes). 8 KB (which is the default buffer size for BufferedOutputStream)
seemed like a reasonable default.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#3205 from ijuma/kafka-5236-snappy-block-size
This PR updates processing of console consumer's input properties.
For both old and new consumer, the value provided for `auto.offset.reset` indirectly through `consumer.config` or `consumer.property` arguments will now take effect.
For new consumer and for `key.deserializer` and `value.deserializer` properties, the precedence order is fixed to first the value directly provided as an argument, then the value provided indirectly via `consumer.property` and then `consumer.config`, and finally a default value.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1655 from vahidhashemian/KAFKA-3982
This resolved the issue with Kafka Streams skipped records sensor reporting wrong values.
Jira ticket: https://issues.apache.org/jira/browse/KAFKA-5368
The contribution is my original work and I license the work to the project under the project's open source license.
Author: Hamidreza Afzali <hrafzali@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#3206 from hrafzali/KAFKA-5368_skipped-records-sensor-bug
KAFKA-5327: ConsoleConsumer should manually commit offsets for those records it really consumed. Currently it leaves this job to the automatic offset commit scheme where some unread messages will be passed if `--max-messages` is set.
Author: amethystic <huxi_2b@hotmail.com>
Author: huxi <huxi_2b@hotmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#3148 from amethystic/KAFKA-5327_ConsoleConsumer_distable_autocommit
When the `SetSchemaMetadata` SMT is used to change the name and/or version of the key or value’s schema, any references to the old schema in the key or value must be changed to reference the new schema. Only keys or values that are `Struct` have such references, and so currently only these are adjusted.
This is based on `trunk` since the fix is expected to be targeted to the 0.11.1 release.
Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#3198 from rhauch/kafka-5164
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
This patch had conflicts when merged, resolved by
Committer: Ismael Juma <ismael@juma.me.uk>
Closes#2328 from vahidhashemian/KAFKA-3264
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
This patch had conflicts when merged, resolved by
Committer: Ismael Juma <ismael@juma.me.uk>
Closes#3129 from vahidhashemian/KAFKA-5282
Author: Dale Peakall <dale@peakall.net>
Reviewers: Michael André Pearce <michael.andre.pearce@me.com>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bbejeck@gmail.com>, Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3199 from subnova/streams-extendeddeserializer
Due to the async nature of the producer, it is possible to attempt to drain a messages whose partition hasn't been added to the transaction yet. Before this patch, we considered this a fatal error. However, it is only in error if the partition isn't in the queue to be sent to the coordinator.
This patch updates the logic so that we only fail the producer if the partition would never be added to the transaction. If the partition of the batch is yet to be added, we will simply wait for the partition to be added to the transaction before sending the batch to the broker.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3202 from apurvam/KAFKA-5364-ensure-partitions-added-to-txn-before-send
- Producer sequence numbers should wrap around
- Generate a new producerId if the producer epoch would overflow
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Apurva Mehta <apurva@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3183 from hachikuji/KAFKA-5283
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma, Ewen Chesklack-Postava, Bill Bejeck, Guozhang Wang
Closes#3194 from mjsax/minor-update-docs-for-kip-123
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3195 from rajinisivaram/KAFKA-5345
changed the reflections log level to ERROR.
And tested it, now the warning logs are not shown up during the start of Kafka connect.
ewencp could you please review the changes.
Author: Bharat Viswanadham <bharatv@us.ibm.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#3072 from bharatviswa504/KAFKA-5229
Add a new entry in upgrade.html for `group.initial.rebalance.delay.ms`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes#3189 from dguy/relabance-delay
Scenario is as follows:
1. Consumer subscribes to topic t1 and begins consuming
2. heartbeat fails as the group is rebalancing
3. ConsumerCoordinator.onJoinGroupPrepare is called
3.1 onPartitionsRevoked is called
4. consumer becomes the group leader
5. sends sync group request
6. sync group is cancelled due to disconnection
7. fetch request is sent for partitions that have previously been revoked
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3181 from dguy/kafka-5154
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3193 from mjsax/kafka-5361-add-eos-integration-tests-for-streams-api
More specifically, V2 messages are always batched (whether compressed or not) while
V0/V1 are only batched if they are compressed.
Clients like librdkafka expect to receive messages from the fetch offset when dealing with uncompressed V0/V1 messages. When converting from V2 to V0/1, we were returning all the
messages in the V2 batch.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#3191 from ijuma/kafka-5360-down-converted-uncompressed-respect-offset
Without this patch, future client retries would get the `CONCURRENT_TRANSACTIONS` error code indefinitely, since the pending state wouldn't be cleared when the append to the log failed.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3184 from apurvam/KAFKA-5351-clear-pending-state-on-retriable-error
subscribed stream may not be detected by all followers until
onJoinComplete returns.
Author: Bill Bejeck <bill@confluent.io>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3157 from bbejeck/KAFKA-5226_null_pointer_source_node_deserialize
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#3172 from guozhangwang/K5350-compatibility-annotations
- Added a boolean `allow_auto_topic_creation` to MetadataRequest and
bumped the protocol version to V4.
- When connecting to brokers older than 0.11.0.0, the `allow_auto_topic_creation`
field won't be considered, so we send a metadata request for all topics
to keep the behavior consistent.
- Set `allow_auto_topic_creation` to false in the new AdminClient and
StreamsKafkaClient (which exists for the purpose of creating topics
manually); set it to true everywhere else for now. Other clients will eventually
rely on client-side auto topic creation, but that’s not there yet.
- Add `allowAutoTopicCreation` field to `Metadata`, which is used by
`DefaultMetadataUpdater`. This is not strictly needed for the new
`AdminClient`, but it avoids surprises if it ever adds a topic to `Metadata`
via `setTopics` or `addTopic`.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#3098 from ijuma/kafka-5291-admin-client-no-auto-topic-creation
This makes the case where we build the records from scratch consistent
with the case where update the batch header "in place". Thanks to
edenhill who found the issue while testing librdkafka.
The reason our tests don’t catch this is that we rely on the maxTimestamp
to compute the record level timestamps if log append time is used.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#3177 from ijuma/set-base-sequence-for-log-append-time
Without it, it's possible that the assertion is checked before the exception
is thrown in the callback.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3182 from ijuma/fix-controller-failover-flakiness
Author: Mario Molina <mmolimar@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>, Michael G. Noll <michael@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3017 from mmolimar/KAFKA-5218
Adjust "importance level" and add explanation to the docs.
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#2855 from mjsax/minor-improve-streams-config-parameters
Also introduce TopicConfig.
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3120 from cmccabe/KAFKA-5265
This should be less flaky as it has a higher timeout. I also increased the timeout
in a couple of other tests that had a very low (100 ms) timeouts.
The failure would manifest itself as:
```text
java.net.SocketTimeoutException
at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:85)
at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:100)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:84)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:133)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:132)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:131)
at kafka.api.test.ProducerCompressionTest.testCompression(ProducerCompressionTest.scala:97)
```
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3178 from ijuma/producer-compression-test-flaky
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#3175 from hachikuji/KAFKA-5349
It sometimes fails in Jenkins like:
```text
java.lang.AssertionError: IllegalStateException was not thrown
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at kafka.controller.ControllerFailoverTest.testHandleIllegalStateException(ControllerFailoverTest.scala:86)
```
I ran it locally 100 times with no failure.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3176 from ijuma/improve-controller-failover-assert
Return UNSUPPORTED_MESSAGE_FORMAT in handleWriteTxnMarkers when a topic is not the correct message format.
Remove any TopicPartitions that have same error from those waiting for markers
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3152 from dguy/kafka-5308
Handle` rocksdb.config.setter` being set as a class name or class
instance.
Author: Tommy Becker <tobecker@tivo.com>
Author: Tommy Becker <twbecker@gmail.com>
Reviewers: Matthias J. Sax, Damian Guy, Guozhang Wang
Closes#3155 from twbecker/KAFKA-5334
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3161 from hachikuji/KAFKA-5251
- reuse decompression buffers in consumer Fetcher
- switch lz4 input stream to operate directly on ByteBuffers
- avoids performance impact of catching exceptions when reaching the end of legacy record batches
- more tests with both compressible / incompressible data, multiple
blocks, and various other combinations to increase code coverage
- fixes bug that would cause exception instead of invalid block size
for invalid incompressible blocks
- fixes bug if incompressible flag is set on end frame block size
Overall this improves LZ4 decompression performance by up to 40x for small batches.
Most improvements are seen for batches of size 1 with messages on the order of ~100B.
We see at least 2x improvements for for batch sizes of < 10 messages, containing messages < 10kB
This patch also yields 2-4x improvements on v1 small single message batches for other compression types.
Full benchmark results can be found here
https://gist.github.com/xvrl/05132e0643513df4adf842288be86efd
Author: Xavier Léauté <xavier@confluent.io>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#2967 from xvrl/kafka-5150
Also update the test to be simpler since we can use a mock event to simulate the issue
more easily (thanks Jun for the suggestion). This should fix two issues:
1. A transient test failure due to a NPE in ControllerFailoverTest.testMetadataUpdate:
```text
Caused by: java.lang.NullPointerException
at kafka.controller.ControllerBrokerRequestBatch.addUpdateMetadataRequestForBrokers(ControllerChannelManager.scala:338)
at kafka.controller.KafkaController.sendUpdateMetadataRequest(KafkaController.scala:975)
at kafka.controller.ControllerFailoverTest.testMetadataUpdate(ControllerFailoverTest.scala:141)
```
The test was creating an additional thread and it does not seem like it was doing the
appropriate synchronization (perhaps this became more of an issue after we changed
the Controller to be single-threaded and changed the locking)
2. Setting `activeControllerId.set(-1)` in `triggerControllerMove` causes `Reelect` not to invoke `onControllerResignation`. Among other things, this causes an `IllegalStateException` to be thrown when `KafkaScheduler.startup` is invoked for the second time without the corresponding `shutdown`. We now simply call `onControllerResignation` as part of `triggerControllerMove`.
Finally, I included a few clean-ups:
1. No longer update the broker state in `onControllerFailover`. This is no longer needed
since we removed the `RunningAsController` state (KAFKA-3761).
2. Trivial clean-ups in KafkaController
3. Removed unused parameter in `ZkUtils.getPartitionLeaderAndIsrForTopics`
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#2935 from ijuma/on-controller-resignation-if-trigger-controller-move