The default client-id bandwidth quota config properties have been marked deprecated in the doc, but a warning may be useful before the property is removed in a future release.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3315 from rajinisivaram/MINOR-deprecate-staticquota
Skip topics that don't have any partitions in zkUtils.getAllPartitions()
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3295 from mimaison/KAFKA-5418
Assign non-null tp as soon as possible once we know the partition. This is
so that if ensureValidRecordSize() throws, the
interceptors.onSendError() call is made with a non-null tp.
Author: Tom Bentley <tbentley@redhat.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3280 from tombentley/tp-assign
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#3298 from hachikuji/KAFKA-5428
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#3300 from hachikuji/KAFKA-5429
When the message copier hangs (like when there is a bug in the client), it ignores the sigterm and doesn't shut down. this leaves the cluster in an unclean state causing future tests to fail.
In this patch we always send SIGKILL when cleaning the node if the process isn't already dead. This is consistent with the other services.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#3308 from apurvam/KAFKA-5437-force-kill-message-copier-on-cleanup
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Apurva Mehta <apurva@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3297 from hachikuji/KAFKA-5427
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3284 from hachikuji/minor-either-usage-cleanup
In `TransationStateManager`, we reset the pending state if an error occurred while appending to log; this is correct except that for the `TransactionMarkerChannelManager`, as it will retry appending to log and if eventually it succeeded, the transaction metadata's completing transition will throw an IllegalStateException since pending state is None, this will be thrown all the way to the `KafkaApis` and be swallowed.
1. Do not reset the pending state if the append will be retried (as is the case when write the complete transition).
2. A bunch of log4j improvements based the debugging experience. The main principle is to make sure all error codes that is about to sent to the client will be logged, and unnecessary log4j entries to be removed.
3. Also moved some log entries in ReplicationUtils.scala to `trace`: this is rather orthogonal to this PR but I found it rather annoying while debugging the logs.
4. A couple of unrelated bug fixes as pointed by hachikuji and apurvam.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Apurva Mehta <apurva@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#3287 from guozhangwang/KHotfix-transaction-coordinator-append-callback
Record `apiThrottleTime` in RequestChannel.
junrao A trivial change. Please review. Thanks.
Author: huxihx <huxi_2b@hotmail.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#3265 from huxihx/KAFKA-5405
Author: Jeyhun Karimov <je.karimov@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3288 from jeyhunkarimov/KAFKA-4661
This assertion is hard to get right because the system time can roll backward on a host due to NTP (as shown in the ticket), and also because a transaction can start on one host and complete on another. Getting precise clock times across hosts is virtually impossible, and this check makes things fragile.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3286 from apurvam/KAFKA-5415-avoid-timestamp-check-in-completeTransition
Originally we assume the task will be created exactly three times (twice upon starting up, once for each thread, and then one more time when rebalancing upon the thread failure). However there is a likelihood that upon starting up more than one rebalance will be triggered, and hence the tasks will be initialized more than 3 times, i.e. there will be more than three hashcodes of the `Transformer` object, causing the `errorInjected` to never be taken and exception never thrown.
The current fix is to use an atomic boolean instead and let threads compete on compare-and-set to make sure exactly one thread will throw exception, and will only throw once.
Without this patch I can reproduce the issue on my local machine with a single core ever 3-5 times; with this patch I have been running successfully for 10+ runs.
Ping mjsax ijuma for reviews.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#3275 from guozhangwang/KHotfix-eos-integration-test
This reverts commit d7d1196a0b.
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3277 from hachikuji/KAFKA-5414
- Show AdminClient configs in the docs.
- Update Javadoc config so that public classes exposed by
the AdminClient are included.
- Version and table of contents fixes.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Colin Mccabe, Gwen Shapira
Closes#3271 from ijuma/kafka-5411-admin-client-javadoc-configs
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3201 from mjsax/kafka-5362-add-eos-system-tests-for-streams-api
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3273 from mjsax/hotfix-flaky-stream-eos-test
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3242 from guozhangwang/K5357-yammer-metrics
We need this to debug most issues with the transactions system test.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#3261 from apurvam/MINOR-set-log-level-for-producer-to-trace-for-transactions-test
ijuma can you please review.
Author: Balint Molnar <balintmolnar91@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3245 from baluchicken/KAFKA-5391
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#3135 from enothereska/exceptions-stores-KAFKA-5314
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#3264 from mjsax/minor-javadocs-timestamp-extractor
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Jun Rao <junrao@gmail.com>
Closes#3257 from ijuma/kafka-5329-fix-order-of-replica-list-in-metadata-cache
* NetworkClient.java: when trace logging is enabled, show AbstractResponse Struct objects, rather than just a memory address of the AbstractResponse.
* AclOperation.java: add documentation of what ACLs imply other ACLs.
* Resource.java: add CLUSTER, CLUSTER_NAME constants.
* Reconcile the Java and Scala classes for ResourceType, OperationType, etc. Add unit tests to ensure they can be converted to each other.
* AclCommand.scala: we should be able to apply ACLs containing Alter and Describe operations to Cluster resources.
* SimpleAclAuthorizer: update the authorizer to handle the ACL inheritance rules described in AclOperation.java.
* KafkaApis.scala: update createAcls and deleteAcls to use ALTER on CLUSTER, as described in the KIP. describeAcls should use DESCRIBE on CLUSTER. Use fromJava methods instead of fromString methods to convert from Java objects to Scala ones.
* SaslSslAdminClientIntegrationTest.scala: do not use AllowEveryoneIfNoAclIsFound. Add a configureSecurityBeforeServerStart hook which installs the ACLs necessary for the tests. Add a test of ACL authorization ALLOW and DENY functionality.
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3240 from cmccabe/KAFKA-5292
Before this patch, we would call `producerBatch.done` directly from the accumulator when expiring batches. This meant that we would not transition to the `ABORTABLE_ERROR` state in the transaction manager, allowing other transactional requests (including Commits!) to go through, even though the produce failed.
This patch modifies the logic so that we call `Sender.failBatch` on every expired batch, thus ensuring that the transaction state is accurate.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#3252 from apurvam/KAFKA-5385-fail-transaction-if-batches-expire
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3259 from hachikuji/group-coordinator-logging-improvements
ijuma can you please review
Author: Balint Molnar <balintmolnar91@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3243 from baluchicken/KAFKA-5389
KAFKA-5394; Fix disconnections due to timeouts in AdminClient
* Create KafkaClient#disconnect to tear down a connection and
deliver disconnects to all the requests on it.
* AdminClient.java: fix mismatched braces in JavaDoc.
* Make the AdminClientConfig constructor visible for testing.
* KafkaAdminClient: add TimeoutProcessorFactory to make the
TimeoutProcessor swappable for testing.
* Make TimeoutProcessor a static class rather than an inner
class.
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3250 from cmccabe/KAFKA-5394
The log_level parameter is used in system tests in kafka.py. However the log4j template accepted that parameter in only one place. This led to a large number of DEBUG lines printed even when the intention was to capture only INFO lines. Led to huge log files. Thanks to ijuma for noticing this.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3247 from enothereska/minor-log4j-template-fix
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3248 from hachikuji/KAFKA-5378
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Derrick Or <derrickor@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3214 from guozhangwang/KMinor-doc-java-brush
override the setting of `group.initial.rebalance.delay` in server.properties
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3254 from dguy/minor-prop-change
1. Fix ordering of metadata update request for regex subscription to avoid timing issue when heartbeat thread updates metadata
2. Override metadata cluster in MockClient for `KafkaConsumer#testChangingRegexSubscription` to avoid timing issues during update
3. Close consumer in all KafkaConsumer tests since they leave behind heartbeat threads.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#3238 from rajinisivaram/KAFKA-5380
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Apurva Mehta <apurva@confluent.io>
Closes#3251 from guozhangwang/KMinor-remove-ignored-streams-integration-test
This currently fails in multiple ways. One of which is most likely KAFKA-5355, where the concurrent consumer reads duplicates.
During broker bounces, the concurrent consumer misses messages completely. This is another bug.
Author: Apurva Mehta <apurva@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#3217 from apurvam/KAFKA-5366-add-concurrent-reads-to-transactions-system-test
Expose the streams commit interval config for different testing scenarios.
Author: Bill Bejeck <bill@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#3244 from bbejeck/MINOR_add_commit_config_benchmark
Add a section in the streams docs about the broker config introduced in KIP-134
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Michael G. Noll, Bill Bejeck, Guozhang Wang
Closes#3179 from dguy/kip134-doc
`DescribeConsumerGroupTest#testDescribeExistingGroupWithNoMembersWithNewConsumer` shuts down the consumer executor thread and then checks that the assignments returned by `describeGroup` contain the consume group with no members. But if the executor thread is shut down before any offsets are committed, the assignments returned by `describeGroup` doesn't contain the group at all. This PR waits for an offset commit by waiting for the group to appear in `describeGroup` assignments prior to shutting down the executor.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#3246 from rajinisivaram/KAFKA-4948
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#3231 from hachikuji/KAFKA-5364
This should fix transient failures due to timeouts caused by slow
auto creation of the offsets topic. The one exception is
`testDescribeGroupWithNewConsumerWithShortInitializationTimeout` where
we want initialisation to take longer than the timeout so we let
it be auto created. That test has also been a bit flaky and I reduced
the timeout to 1 ms.
Also:
- Simplified resource handling
- Removed usage of EasyMock that didn't make sense.
- `findCoordinator` should retry if `sendAnyNode` throws an exception
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#3237 from ijuma/kafka-49480-describe-consumer-group-test-failure
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2835 from rajinisivaram/KAFKA-5051