The toString() for ConfigResource was using { } instead of ( ), which is inconsistent with the existing toString implementations in the code, while the toString() for Resource was using a mix of ( and }.
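For illustration, a consistent format might look like the following sketch (field names are assumptions, not necessarily the exact ConfigResource implementation):

```java
@Override
public String toString() {
    // parentheses, matching the other toString implementations in the codebase
    return "ConfigResource(type=" + type + ", name='" + name + "')";
}
```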
This patch fixes an edge case in producer shutdown in which `close()` cannot complete because of a pending request that will never be sent once shutdown has been initiated. I have added a test case which reproduces the scenario.
Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Implementation of KIP-86. Client, server and login callback handlers have been made configurable for both brokers and clients.
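As a sketch of how a client might plug in custom handlers under KIP-86 (the handler class names here are hypothetical examples):

```java
Properties saslProps = new Properties();
// KIP-86 client-side configs; the handler classes are placeholders for user implementations
saslProps.put("sasl.client.callback.handler.class", "com.example.MyClientCallbackHandler");
saslProps.put("sasl.login.callback.handler.class", "com.example.MyLoginCallbackHandler");
```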
Reviewers: Jun Rao <junrao@gmail.com>, Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
The invalid topic name is already handled locally, so it is unnecessary to send the DeleteTopicsRequest. This PR also adds a count to MockClient for testing.
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Jason Gustafson <jason@confluent.io>
Ignore headers when down-converting to V0/V1 since they are not supported. Added a test case to verify down-conversion sanity in the presence of headers.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
In the group coordinator, we currently check whether the partition is owned before checking whether it is loading. Since loading is a prerequisite for partition ownership, it means that it is not actually possible to see the COORDINATOR_LOAD_IN_PROGRESS error. The impact is mostly harmless: while loading the group, the client may send unnecessary FindCoordinator requests to rediscover the coordinator. I've fixed the bug and restructured the code to enable testing.
In the process of fixing this bug, the following improvements have been made:
1. We now verify valid groupId in all request handlers.
2. Currently if the coordinator is loading when a SyncGroup is received, we'll return NOT_COORDINATOR. I've changed this to return REBALANCE_IN_PROGRESS since the rebalance state will have been lost on coordinator failover. This effectively forces the consumer to rejoin the group, which seems preferable to unnecessarily rediscovering the coordinator.
3. I added a check for the COORDINATOR_LOAD_IN_PROGRESS handler in SyncGroup. Although we do not currently return this error, it seems reasonable that we might want to some day, so it seems better to get the check in now.
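A minimal sketch of the corrected check order (Java-style pseudocode; the helper method names are assumptions, not the actual GroupCoordinator code):

```java
Errors validateGroupStatus(String groupId) {
    if (isInvalidGroupId(groupId))               // improvement 1: validate the groupId in all request handlers
        return Errors.INVALID_GROUP_ID;
    if (isCoordinatorLoadInProgress(groupId))    // check loading before ownership so this error can actually be returned
        return Errors.COORDINATOR_LOAD_IN_PROGRESS;
    if (!isCoordinatorFor(groupId))
        return Errors.NOT_COORDINATOR;
    return Errors.NONE;
}
```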
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Currently the `initTransactions()` API blocks indefinitely if the broker cannot be reached. This patch changes the behavior to raise a `TimeoutException` after waiting for `max.block.ms`.
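For example (a sketch; the broker address and transactional id are placeholders):

```java
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "demo-txn");
props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 10000);
try (Producer<String, String> producer =
         new KafkaProducer<>(props, new StringSerializer(), new StringSerializer())) {
    producer.initTransactions(); // now raises TimeoutException if the broker is unreachable for max.block.ms
} catch (org.apache.kafka.common.errors.TimeoutException e) {
    // the caller can retry or fail fast instead of blocking forever
}
```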
Reviewers: Apurva Mehta <apurva@confluent.io>, Jason Gustafson <jason@confluent.io>
DynamicBrokerReconfigurationTest currently assumes that passwords encoded with one secret will fail with an exception if decoded with another secret, and it configures an old.secret in setUp. This could potentially cause test failures if a password were incorrectly decoded with the wrong secret, since the test writes passwords encoded with the new secret directly to ZooKeeper. Since old.secret is only used in one test for verifying secret rotation, this config can be moved to that test to avoid transient failures.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Added configs to ProducerBenchSpec:
topicPrefix: topic names will be of the form topicPrefix + topic index. If not provided, the default is "produceBenchTopic".
partitionsPerTopic: number of partitions per topic. If not provided, the default is 1.
replicationFactor: replication factor per topic. If not provided, the default is 3.
The behavior of the producer bench is changed such that if some or all topics already exist (with topic names = topicPrefix + topic index) and they have the same number of partitions as requested, the worker uses those topics and does not fail. The producer bench fails if one or more existing topics has a number of partitions different from the expected number of partitions.
Added unit test for WorkerUtils -- for existing methods and new methods.
Fixed a bug in MockAdminClient where createTopics() would overwrite an existing topic's replication factor and number of partitions even while correctly completing the appropriate futures exceptionally with TopicExistsException.
Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
This is a general change and is a prerequisite for running the Streams benchmark with different Streams tests. For the Streams benchmark itself I will have a separate PR for switching configs. Details:
1. Create a "streams.properties" file under PERSISTENT_ROOT before all the Streams tests. For now it will only contain a single config, state.dir, pointing to PERSISTENT_ROOT.
2. For all the system test related code, replace the main function parameter of state.dir with propsFilename, then inside the function load the props from the file and apply overrides if necessary.
3. Minor fixes.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
KafkaFutureImpl#addWaiter should be protected, just like KafkaFuture#addWaiter. As described in KIP-218, whenComplete is the public API, not addWaiter.
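For example, client code should rely on the public API (a sketch, assuming an existing AdminClient instance named adminClient):

```java
KafkaFuture<Set<String>> topicNames = adminClient.listTopics().names();
// whenComplete is the supported public hook (KIP-218); addWaiter is an internal detail
topicNames.whenComplete((names, error) -> {
    if (error != null)
        System.err.println("listTopics failed: " + error);
    else
        System.out.println("Topics: " + names);
});
```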
Use the exact first offset of the message set when rolling a log segment. This is possible for message format V2 and beyond without any performance penalty, because the first offset is stored in the header. This augments the fix made in KAFKA-4451 to avoid using the heuristic for V2 and beyond messages.
Added unit tests to simulate cases where segment needs to roll because of overflow in index offsets. Verified that the new segment created in these cases uses the first offset, instead of the heuristic in use previously.
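A sketch of the idea (illustrative only, not the actual log-roll code):

```java
// for message format v2, the batch header carries the exact base offset, so the new
// segment can be rolled at that offset instead of relying on the old heuristic
static long exactFirstOffset(MemoryRecords records) {
    RecordBatch firstBatch = records.batches().iterator().next();
    return firstBatch.baseOffset();
}
```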
Remove unnecessary null check in StringDeserializer, MockProducerInterceptor and KStreamImpl.
Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Jason Gustafson <jason@confluent.io>
Currently in KafkaAdminClient.describeTopics(), a complete map of cluster state and errors is constructed for every topic and partition in the request, which unnecessarily increases the complexity of describeTopics() to O(n^2). This patch improves the complexity to O(n).
Reviewers: Ismael Juma <ismael@juma.me.uk>, Colin Patrick McCabe <colin@cmccabe.xyz>, Jason Gustafson <jason@confluent.io>
Now that we have augmented the windowed serdes with no-arg constructors, extract them out as part of the public API so that users who want to read and write windowed streams can use them. This was originally introduced by @vitaly-pushkar
This PR grew into a much larger one, as I found a few tech debts and bugs while working on it. Here is a summary of the PR:
Public API changes (I will propose a KIP after a first round of reviews):
Add TimeWindowedSerializer, TimeWindowedDeserializer, SessionWindowedSerializer, SessionWindowedDeserializer into o.a.k.streams.kstream. The serializers implement an internal WindowedSerializer interface exposing the serializeBaseKey function used in 3) below.
Add WindowedSerdes into o.a.k.streams.kstream. The reason not to add them into o.a.k.clients' Serdes is that clients would then need a dependency on streams.
Add "default.windowed.key.serde.inner" and "default.windowed.value.serde.inner" into StreamsConfig, used when "default.key.serde" is set to a time or session windowed serde. Note these configs take the serde class, not the type class (see the usage sketch below).
Consolidated the serde formats from multiple classes, including SessionKeySerde.java for session windows and WindowStoreUtils for time windows, into SessionKeySchema and WindowKeySchema.
Bug fix: WindowedStreamPartitioner needs to consider both time window and session window serdes.
Removed the RocksDBWindowBytesStore etc. optimization since after KIP-182 all the serde work now happens in the metered store, hence this optimization is no longer worthwhile.
Bug fix: for time windows, the serdes used for the store and the serdes used for piping (source and sink nodes) are different: the former needs to append a sequence number, but the latter does not.
Other minor cleanups: remove unnecessary throws, etc.
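For example, using the new public serdes and configs (a sketch based on the API described above; the exact constant and method names are assumptions):

```java
// construct a windowed serde directly ...
Serde<Windowed<String>> windowedKeySerde = WindowedSerdes.timeWindowedSerdeFrom(String.class);

// ... or configure the defaults via StreamsConfig
Properties config = new Properties();
config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, WindowedSerdes.TimeWindowedSerde.class);
config.put("default.windowed.key.serde.inner", Serdes.String().getClass());
```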
Authors: Guozhang Wang <wangguoz@gmail.com>, Vitaly Pushkar <vitaly.pushkar@gmail.com>
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>, Xi Hu
Logging can get spammy during the reconnect blackout period because any requests we send to ConsumerNetworkClient will immediately be failed when poll() returns. This patch checks for connection failures prior to sending fetches and offset lookups and skips sending to any failed nodes. Test cases added for both.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
NetworkClient should use FIFO order when completing inflight requests following a disconnect.
I've added new unit tests for `InFlightRequests` and `NetworkClient` which verify completion order.
Reviewers: Jun Rao <junrao@gmail.com>
We need to reset the auto-commit deadline after sending the offset commit request so that we do not resend it while the request is still inflight.
Added unit tests ensuring this behavior and proper backoff in the case of a failure.
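The intended pattern, sketched in simplified form (not the actual ConsumerCoordinator code; names are illustrative):

```java
// reset the deadline when the commit request is sent, not when its response arrives,
// so an inflight auto-commit is never re-sent
if (autoCommitEnabled && now >= nextAutoCommitDeadline) {
    sendAutoCommitOffsetsAsync();
    nextAutoCommitDeadline = now + autoCommitIntervalMs;
}
```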
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Contention for the lock in ConsumerNetworkClient can lead to a livelock situation in which an active commitSync is unable to make progress because its completion is blocked in the heartbeat thread. The fix is twofold:
1) We change ConsumerNetworkClient to use a fair lock (sketched below) to reduce the chance of either thread getting starved.
2) We eliminate the dependence on the lock in ConsumerNetworkClient for callback completion so that callbacks will not be blocked by an active poll().
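For reference, part (1) amounts to something like the following (a sketch; the field name is an assumption):

```java
// java.util.concurrent.locks.ReentrantLock with fairness enabled: the longest-waiting
// thread acquires the lock next, so the heartbeat thread cannot repeatedly starve an
// application thread blocked in commitSync()
private final ReentrantLock lock = new ReentrantLock(true);
```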
Reviewers: Guozhang Wang <wangguoz@gmail.com>
The fix is twofold:
For tasks that are closed in closeZombieTask, their corresponding partitions are still in runningByPartition, so those closed tasks may still be returned from activeTasks and standbyTasks. We add guards on the returned tasks: if they are closed, notify the thread to trigger a rebalance immediately.
When triggering a rebalance, unsubscribe and re-subscribe immediately to make sure we are not dependent on the background heartbeat thread's timing.
Some minor log4j changes. More specifically, I moved the closeZombieTask log entry to its callers, with more context information and the action about to be taken.
I can reproduce the issue by hand-coding the heartbeat thread in EosIntegrationTest to simulate a GC pause, and have confirmed this patch fixes the issue. Unfortunately this test cannot be added to AK since we currently do not have a way to manipulate the heartbeat thread in unit tests.
Reviewers: Jason Gustafson <jason@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Fixes lgtm.com warnings.
Cleans up PrintForeachAction and Printed.
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Sebastian Bauersfeld <sebastianbauersfeld@gmx.de>, Damian Guy <damian@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
This patch fixes a bug in the validation of the inter-broker protocol and the message format version. We should allow the configured message format api version to be greater than the inter-broker protocol api version as long as the actual message format versions are equal. For example, if the message format version is set to 1.0, it is fine for the inter-broker protocol version to be 0.11.0 because they both use message format v2.
I have added a unit test which checks compatibility for all combinations of the message format version and the inter-broker protocol version.
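Concretely, a combination like the following is now accepted (a sketch of the two broker configs; both versions map to message format v2):

```java
Properties brokerProps = new Properties();
brokerProps.put("inter.broker.protocol.version", "0.11.0");
brokerProps.put("log.message.format.version", "1.0"); // allowed: 1.0 and 0.11.0 both use message format v2
```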
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #4583 from hachikuji/KAFKA-6328-REOPENED
Add validation checks that the offset range is valid and aligned with the batch count prior to appending to the log. Several unit tests have been added to verify the various invalid cases.
Make it clear in the docs that the rebalance listener is only invoked during an active call to `poll()`. Plus a few additional doc cleanups.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Currently we hold onto all Records references in a multi-partition fetch response until the full response has completed. This can be a problem when the records have been down-converted since they will be occupying a (potentially large) chunk of memory. This patch changes the behavior in MultiSend so that once a Send is completed, we no longer keep a reference to it, which will allow the Records objects to be freed sooner.
I have added a simple unit test to verify that sends are removed as the MultiSend progresses.
Reviewers: Ismael Juma <ismael@juma.me.uk>
`Node` is immutable so this is safe.
With 100 brokers, 150 topics and 350 partitions, `HashSet.contains` in `RecordAccumulator.ready` took about 40% of the application time. This was caused by re-calculating the hash code of the leader (a Node instance) for every batch entry. Caching the hashCode reduced the time spent in `HashSet.contains` in `RecordAccumulator.ready` to ~2%. The measurements were taken with Flight Recorder.
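A minimal sketch of the caching pattern (illustrative; not the exact `Node` fields or implementation, and equals is omitted for brevity):

```java
public final class Node {
    private final int id;
    private final String host;
    private final int port;
    private Integer hash; // caching is safe because all fields are immutable

    public Node(int id, String host, int port) {
        this.id = id;
        this.host = host;
        this.port = port;
    }

    @Override
    public int hashCode() {
        Integer h = this.hash;
        if (h == null) {
            h = java.util.Objects.hash(id, host, port);
            this.hash = h; // benign race: at worst the hash is computed more than once
        }
        return h;
    }
}
```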
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ted Yu <yuzhihong@gmail.com>, Ismael Juma <ismael@juma.me.uk>
ProducerBatch retains references to MemoryRecordsBuilder and cannot be freed until acks are received. Removing references to buffers used for compression after records are built will enable these to be garbage collected sooner, reducing the risk of OOM.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Lothsahn <Lothsahn@gmail.com>
This fixes two alerts flagged on lgtm.com for Apache Kafka.
The first commit addresses a dead-code alert: InvalidTypeIdException indirectly extends JsonMappingException, and the flagged type test appears after the type test for the latter, which makes its body dead. I opted to change the order of the tests. Please let me know if this is the intended behavior.
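The pattern of the fix, sketched (not the exact code):

```java
import com.fasterxml.jackson.databind.JsonMappingException;
import com.fasterxml.jackson.databind.exc.InvalidTypeIdException;

static String classify(Throwable cause) {
    // InvalidTypeIdException is a subtype of JsonMappingException, so it must be tested first;
    // testing the supertype first makes the subtype branch unreachable (the dead code flagged by lgtm.com)
    if (cause instanceof InvalidTypeIdException)
        return "invalid type id";
    else if (cause instanceof JsonMappingException)
        return "other mapping error";
    return "unknown";
}
```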
The second commit addresses this out-of-bounds alert.
More alerts can be found here. Note that my colleague Aditya Sharad addressed some of those in the now outdated #2939.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
Added a second check for the race condition where the store changelog topic is updated during restore, but not if it is a KTable changelog topic. This will be tricky to test, but I wanted to push the PR to get feedback on the approach.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
Prior to this patch, the consumer always blocks in poll() if there are any partitions which are awaiting their initial positions. This behavior was inconsistent with normal fetch behavior since we allow fetching on available partitions even if one or more of the assigned partitions becomes unavailable _after_ initial offset lookup. With this patch, the consumer will do offset resets asynchronously, which allows other partitions to make progress even if the initial positions for some partitions cannot be found.
I have added several new unit tests in `FetcherTest` and `KafkaConsumerTest` to verify the new behavior. One minor compatibility implication worth mentioning is apparent from the change I made in `DynamicBrokerReconfigurationTest`. Previously it was possible to assume that all partitions had a fetch position after `poll()` completed with a non-empty assignment. This assumption is no longer generally true, but you can force the positions to be updated using the `position()` API which still blocks indefinitely until a position is available.
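For example (a sketch; the topic name and timeout are placeholders):

```java
consumer.subscribe(Collections.singletonList("my-topic"));
// poll() no longer blocks until every assigned partition has an initial position
ConsumerRecords<String, String> records = consumer.poll(100);
for (TopicPartition tp : consumer.assignment()) {
    long position = consumer.position(tp); // position() still blocks until a position is available
}
```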
Note that this patch also removes the logic to cache committed offsets in `SubscriptionState` since it was no longer needed (the consumer's `committed()` API always does an offset lookup anyway). In addition to avoiding the complexity of maintaining the cache, this avoids wasteful offset lookups to refresh the cache when `commitAsync()` is used.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
* do not use static properties
* use new object to take appID
* capture timeout exception inside condition
Reviewers: Matthias J. Sax <matthias@confluent.io>