Rewrite ReassignPartitionsCommand to use the KIP-455 API when possible, rather
than direct communication with ZooKeeper. Direct ZK access is still supported,
but deprecated, as described in KIP-455.
As specified in KIP-455, the tool has several new flags. --cancel stops
an assignment which is in progress. --preserve-throttle causes the
--verify and --cancel commands to leave the throttles alone.
--additional allows users to execute another partition assignment even
if there is already one in progress. Finally, --show displays all of
the current partition reassignments.
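For context, the KIP-455 reassignment API that the tool now calls is part of the Java Admin client. A minimal sketch of driving it directly (topic name, broker ids, and bootstrap server below are illustrative):
```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.clients.admin.PartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class Kip455Sketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
        try (AdminClient admin = AdminClient.create(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0); // illustrative topic

            // Start moving the partition to brokers 1, 2, 3 (what --execute drives).
            admin.alterPartitionReassignments(Collections.singletonMap(
                    tp, Optional.of(new NewPartitionReassignment(Arrays.asList(1, 2, 3)))))
                .all().get();

            // List every reassignment currently in progress (what --show surfaces).
            Map<TopicPartition, PartitionReassignment> ongoing =
                admin.listPartitionReassignments().reassignments().get();
            System.out.println("In progress: " + ongoing.keySet());

            // Cancel the in-progress reassignment (what --cancel uses): an empty target.
            admin.alterPartitionReassignments(
                    Collections.singletonMap(tp, Optional.<NewPartitionReassignment>empty()))
                .all().get();
        }
    }
}
```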
Reorganize the reassignment code and tests somewhat to rely more on unit
testing using the MockAdminClient and less on integration testing. Each
integration test where we bring up a cluster seems to take about 5 seconds, so
it's good when we can get similar coverage from unit tests. To enable this,
MockAdminClient now supports incrementalAlterConfigs, alterReplicaLogDirs,
describeReplicaLogDirs, and some other APIs. MockAdminClient is also now
thread-safe, to match the real AdminClient implementation.
In DeleteTopicTest, use the KIP-455 API rather than invoking the reassignment
command.
Currently when there is a leader change with a log dir reassignment in progress, we do not update the leader epoch in the partition state maintained by `ReplicaAlterLogDirsThread`. This can lead to a FENCED_LEADER_EPOCH error, which results in the partition being marked as failed, which is a permanent failure until the broker is restarted. This patch fixes the problem by updating the epoch in `ReplicaAlterLogDirsThread` after receiving a new LeaderAndIsr request from the controller.
Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>
- part of KIP-447
- commit all tasks at once using non-eos (with eos-beta in follow-up work)
- unified commit logic into TaskManager
- split existing methods of the Task interface into pre/post parts
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
Add 4 new assignor configs in preparation for the new assignment algorithm (see the sketch after this list):
1. acceptable.recovery.lag
2. balance.factor
3. max.warmup.replicas
4. probing.rebalance.interval.ms
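A minimal sketch of setting them on a Streams app via their literal key strings; the application id, bootstrap servers, and chosen values are illustrative, and the inline descriptions follow KIP-441:
```java
import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class AssignorConfigSketch {
    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "example-app");       // illustrative
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
        // The new assignor configs, set via their literal key names:
        props.put("acceptable.recovery.lag", 10_000L);       // max lag for a client to be considered caught up
        props.put("balance.factor", 1);                      // allowed difference in task counts between instances
        props.put("max.warmup.replicas", 2);                 // extra replicas used only to warm up state before a move
        props.put("probing.rebalance.interval.ms", 600_000L); // how often to trigger a probing rebalance
        return props;
    }
}
```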
Implements: KIP-441
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
Since the assignment info includes a map with all members' host info, we can just check the received map to make sure our endpoint is contained. If not, we need to force the group to rebalance and get our updated endpoint info.
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
The issue itself was fixed a while ago on the producer side, so we can just remove this TODO marker now (we've already removed the isZombie flag anyway).
Reviewers: John Roesler <vvcephei@apache.org>
The `TxnOffsetCommit` API suffers from a bug affecting older client versions which treat `COORDINATOR_LOAD_IN_PROGRESS` errors as fatal. This PR changes the handling on the broker to instead return `COORDINATOR_NOT_AVAILABLE` in this case so that clients won't crash upon doing txn commit.
Reviewers: Jason Gustafson <jason@confluent.io>
Previously, `AdminClient` group operations did not respect a `Call`'s configured number of tries and retry backoff. This could lead to tight retry loops that put a lot of pressure on the broker. This PR ensures that for all group operations the `AdminClient` respects the number of tries and the backoff configured for a given `Call`.
Reviewers: Vikas Singh <vikas@confluent.io>, Jason Gustafson <jason@confluent.io>
Highlights:
* Performance improvements in the collections
library: algorithmic improvements and
changes to avoid unnecessary allocations.
* Performance improvements in the compiler.
* ASM was upgraded to 7.3.1, allowing the
optimizer to run on JDK 13+.
Full release notes: https://github.com/scala/scala/releases/tag/v2.12.11
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
* Broker throttles were incorrectly marked as sensitive configurations. Fix this, so that their values can be returned via DescribeConfigs as expected (see the sketch after this list).
* Previously, changes to broker configs that consisted only of deletions were ignored by the brokers because the delta calculation logic treated only alterations, not deletions, as changes. Fix this and add a regression test.
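After the fix, the throttle value should be readable with a plain DescribeConfigs call against the broker resource. A minimal sketch; the broker id and bootstrap server are illustrative, and leader.replication.throttled.rate is the standard dynamic throttle config:
```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class DescribeThrottleSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "0"); // illustrative broker id
            Config config = admin.describeConfigs(Collections.singleton(broker))
                .all().get().get(broker);
            // With the fix, the throttle is no longer treated as sensitive, so its value is returned.
            System.out.println(config.get("leader.replication.throttled.rate"));
        }
    }
}
```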
Reviewers: Colin P. McCabe <cmccabe@apache.org>
No logical or behavioral changes, just a bit of cleanup in this class before we have to write and fix a lot of these tests for KIP-441:
* Moved creation of streamsMetadata mock to setUp (in exactly one test it will be overwritten with a strict mock)
* Tried to clean up the use of helper methods for configuring the assignor.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
StateDirectoryTest.shouldReturnEmptyArrayIfListFilesReturnsNull always moves the state dir to /tmp/state-renamed, so it fails whenever that folder already exists (for example, left over from a previous test).
Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
In prepareAddPartitions the txnStartTimestamp could be updated to the updateTimestamp, which is assumed to always be larger than the original startTimestamp. However, due to NTP time shift the clock may go backwards, so the new start timestamp can end up smaller than the original one. Later, the time check in completeTransitionTo fails with an IllegalStateException, and the txn never transitions to Ongoing.
An indirect result is that this txn would never be expired, because only Ongoing transactions are checked for expiration.
We should do the same as in #3286 to remove this check.
Also added test coverage for both KAFKA-5415 and KAFKA-8803.
Reviewers: Jason Gustafson <jason@confluent.io>
I have seen an increased incidence of StackOverflowErrors when compiling Scala. This
change doubles the max stack size to 4m.
```
> Task :core:compileScala FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':core:compileScala'.
> java.lang.StackOverflowError (no error message)
```
Reviewers: Andrew Choi <a24choi@edu.uwaterloo.ca>, Ismael Juma <ismael@juma.me.uk>
Adds tests for edge conditions of listAllTaskDirectories
Also includes some minor cleanup of the StateDirectoryTest class
Reviewers: Guozhang Wang <wangguoz@gmail.com>
The test was broken by commit 227a7322b77840e08924b9486e4bda2f3dfc1f1a.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
This commit works around a bug in version v0.9.12 of the upstream `reflections` library by catching and handling the exception thrown.
The reflections issue is tracked by:
https://github.com/ronmamo/reflections/issues/273
New unit tests were introduced to test the behavior.
* KAFKA-9712: Catch and handle exception thrown by reflections scanner
* Update connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoader.java
Co-Authored-By: Konstantine Karantasis <konstantine@confluent.io>
* Move result initialization back to right before it is used
* Use `java.io.File` in tests
* Fix checkstyle
Co-authored-by: Konstantine Karantasis <konstantine@confluent.io>
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
This PR removes intermediate conversions between `MetadataResponse.TopicMetadata` => `MetadataResponseTopic` and `MetadataResponse.PartitionMetadata` => `MetadataResponsePartition` objects.
There is 15-20% reduction in object allocations and 5-10% improvement in metadata request performance.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
While discussing KIP-441 we realized we don't strictly enforce that all checkpointed offset sums are positive (or 0, though there's not much point in checkpointing a 0 offset).
Rather than awkwardly trying to handle this within every user/reader of the checkpoint file, we should just make a guarantee that all returned checkpointed offsets are positive.
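A minimal sketch of what such a guarantee could look like at the read path; the method name and exception choice are illustrative, not the actual checkpoint-file code:
```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.TopicPartition;

public class CheckpointSanitySketch {
    /**
     * Hypothetical guard: reject any checkpointed offset that is negative so that
     * readers never have to special-case invalid values themselves.
     */
    public static Map<TopicPartition, Long> requireNonNegative(Map<TopicPartition, Long> raw) {
        Map<TopicPartition, Long> sanitized = new HashMap<>(raw.size());
        for (Map.Entry<TopicPartition, Long> entry : raw.entrySet()) {
            Long offset = entry.getValue();
            if (offset == null || offset < 0) {
                throw new IllegalStateException(
                    "Read an invalid checkpointed offset " + offset + " for " + entry.getKey());
            }
            sanitized.put(entry.getKey(), offset);
        }
        return sanitized;
    }
}
```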
Reviewers: Guozhang Wang <wangguoz@gmail.com>
As a developer, it would be convenient if the generated
{request,response}HeaderVersion case statements in ApiMessageType.java
included a comment to remind me which type each of them is so I don't
need to manually cross-reference the newer/rarer ones.
Also include commented lines for the two special cases around
ApiVersionsResponse and ControlledShutdownRequest which are hardcoded in
the ApiMessageTypeGenerator.java and not covered by the message format
json files.
Before:
```java
public short requestHeaderVersion(short _version) {
    switch (apiKey) {
        case 0:
            return (short) 1;
        case 1:
            return (short) 1;
        case 2:
            return (short) 1;
        case 3:
            if (_version >= 9) {
                return (short) 2;
            } else {
                return (short) 1;
            }
        // ...etc
```
After:
```java
public short requestHeaderVersion(short _version) {
    switch (apiKey) {
        case 0: // Produce
            return (short) 1;
        case 1: // Fetch
            return (short) 1;
        case 2: // ListOffset
            return (short) 1;
        case 3: // Metadata
            if (_version >= 9) {
                return (short) 2;
            } else {
                return (short) 1;
            }
        // ...etc
```
Signed-off-by: Dominic Evans <dominic.evans@uk.ibm.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>
1. Inside StateDirectory#cleanRemovedTasks, skip deleting the lock file (and hence the parent directory) until the lock is released. After the lock is released, only go ahead and delete the parent directory if manualUserCall == true, i.e. when the cleanup is triggered from KafkaStreams#cleanUp, where users are responsible for making sure the Streams instance is not started and hence no other threads are trying to grab that lock.
2. As a result, during scheduled cleanup the corresponding task.dir would not be empty but would be left with only the lock file, so effectively we still achieve the goal of releasing disk space. For callers of listTaskDirectories like KIP-441 (cc @ableegoldman to take a look) I've introduced a new listNonEmptyTaskDirectories which excludes such dummy task.dirs that contain only the lock file.
3. Also fixed KAFKA-8999 along the way to expose the exception while traversing the directory.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
When we changed quota communication with KIP-219, fetch requests get throttled by returning an empty response with the delay in throttle_time_ms, and the Kafka consumer retries after the delay. With default configs, the maximum fetch size could be as big as 50MB (or 10MB per partition). The default broker config (1-second window, 10 full windows of tracked bandwidth/thread utilization usage) means that a consumer quota below 5MB/s (per broker) may block consumers from fetching any data.
This PR ensures that consumers cannot get blocked by quota by capping fetchMaxBytes in KafkaApis.handleFetchRequest() to quota window * consume bandwidth quota. In the example of default configs (10-second quota window) and 1MB/s consumer bandwidth quota, fetchMaxBytes would be capped to 10MB.
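The cap itself is simple arithmetic; a small sketch of the idea (names are illustrative, the real change lives in KafkaApis, which is Scala):
```java
public class FetchQuotaCapSketch {
    /**
     * Cap the fetch size so a single response can never exceed what the quota
     * window allows, preventing throttled consumers from being starved forever.
     */
    static int capFetchMaxBytes(int requestedFetchMaxBytes,
                                double quotaBytesPerSecond,
                                int quotaWindowSeconds) {
        long quotaWindowBytes = (long) (quotaBytesPerSecond * quotaWindowSeconds);
        return (int) Math.min(requestedFetchMaxBytes, quotaWindowBytes);
    }

    public static void main(String[] args) {
        // Example from the description: 10-second quota window, 1MB/s quota -> cap at 10MB.
        System.out.println(capFetchMaxBytes(50 * 1024 * 1024, 1024 * 1024, 10)); // 10485760
    }
}
```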
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Reuse the same pseudo-topic for serializing the LHS value in the foreign-key join resolver as
we originally used to serialize it before sending the subscription request.
Reviewers: Boyang Chen <boyang@confluent.io>
KIP-441 Pt. 2: Compute sum of offsets across all stores/changelogs in a task and include them in the subscription.
Previously each thread would just encode every task on disk, but we now need to read the checkpoint file, which is unsafe to do without a lock on the task directory. So, each thread now encodes only its assigned active and standby tasks, and ignores any already-locked tasks.
In some cases there may be unowned and unlocked tasks on disk that were reassigned to another instance and haven't been cleaned up yet by the background thread. Each StreamThread makes a weak effort to lock any such task directories it finds, and if successful is then responsible for computing and reporting that task's offset sum (based on reading the checkpoint file)
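A rough sketch of the per-task offset sum described above, assuming the checkpoint file has already been read into a map; the class and method names here are illustrative, not the actual Streams internals:
```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.TopicPartition;

public class OffsetSumSketch {
    /**
     * Sum the checkpointed offsets of all changelog partitions belonging to one task.
     * A hypothetical stand-in for what each StreamThread reports in its subscription.
     */
    public static long offsetSumForTask(Map<TopicPartition, Long> checkpointedOffsets) {
        long sum = 0L;
        for (Long offset : checkpointedOffsets.values()) {
            if (offset == null || offset < 0) {
                continue; // skip sentinel/unknown offsets rather than corrupting the sum
            }
            // Guard against overflow when a task has many large changelog offsets.
            if (sum > Long.MAX_VALUE - offset) {
                return Long.MAX_VALUE;
            }
            sum += offset;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<TopicPartition, Long> checkpoint = new HashMap<>();
        checkpoint.put(new TopicPartition("app-store-changelog", 0), 42L);
        checkpoint.put(new TopicPartition("app-store-changelog", 1), 100L);
        System.out.println(offsetSumForTask(checkpoint)); // 142
    }
}
```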
This PR therefore also addresses two orthogonal issues:
1. Prevent background cleaner thread from deleting unowned stores during a rebalance
2. Deduplicate standby tasks in subscription: each thread used to include every (non-active) task found on disk in its "standby task" set, which meant every active, standby, and unowned task was encoded by every thread.
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
Avoid using the ++ operation on Set, which is slow when a Set has many entries. This PR introduces a new class 'AclSets' which takes multiple Sets as parameters and does 'find' against them one by one. For more details about perf and benchmark, refer to [KAFKA-9685](https://issues.apache.org/jira/browse/KAFKA-9685)
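A simplified sketch of the idea (not the actual AclSets class): search the backing sets one by one and stop at the first match, instead of allocating a combined Set with ++.
```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class AclSetsSketch<T> {
    private final List<Set<T>> sets;

    @SafeVarargs
    public AclSetsSketch(Set<T>... sets) {
        this.sets = Arrays.asList(sets);
    }

    /**
     * Return the first element matching the predicate, searching each backing set
     * in turn, without ever materializing a union of the sets.
     */
    public T find(Predicate<T> predicate) {
        for (Set<T> set : sets) {
            for (T element : set) {
                if (predicate.test(element)) {
                    return element;
                }
            }
        }
        return null;
    }
}
```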
Author: jiao <jiao.zhang@linecorp.com>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes #8261 from jiao-zhangS/jira-9685
This PR adds an internal flag to throw if we hit an unexpected protocol version for offset fetch. It can be used together with the EOS_BETA flag so that if the server side downgrades unexpectedly, we fail the application ASAP.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Problem
----
The `incrementalAlterConfigs` API supports OpType.APPEND and OpType.SUBTRACT for configuration properties of LIST type. If an APPEND or SUBTRACT OpType is submitted for a config property which currently has no value, then the operation fails with a NullPointerException on the broker side (conveyed as an "unknown server error" to the client).
This is because the alter code does a `getProperty` of the existing configuration value
with no concern as to whether or not the property actually exists.
This change handles the case of existing null properties.
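For reference, this is the shape of call that used to trip the NPE: an APPEND to a LIST-type property that has no current value. The bootstrap server, topic name, and config choice below are illustrative:
```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class IncrementalAlterSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // cleanup.policy is a LIST-type config; APPEND adds a value to whatever is there,
            // which previously caused an NPE on the broker when no value was set yet.
            AlterConfigOp append = new AlterConfigOp(
                new ConfigEntry("cleanup.policy", "compact"), AlterConfigOp.OpType.APPEND);
            admin.incrementalAlterConfigs(
                Collections.singletonMap(topic, Collections.singleton(append))).all().get();
        }
    }
}
```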
Testing
-----
This change includes 2 test cases in the unit test that demonstrate the issue for OpType.SUBTRACT and OpType.APPEND.
Author: Steve Rodrigues <srodrigues@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Bob Barrett <bob.barrett@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #8216 from steverod/steverod.kafka-9644
Throw InvalidRequestException if null configs are specified for CreateTopics, AlterConfigs or IncrementalAlterConfigs.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
We detected a bug in soak testing where producer batches could be failed in the sender loop before the produce response callback, triggering an IllegalStateException on the producer batch because it was already aborted.
The impact is not severe since the sender runs on its own thread, but it should be fixed to avoid an unnecessary critical exception.
Reviewers: Bob Barrett <bob.barrett@confluent.io>, Jason Gustafson <jason@confluent.io>