We no longer need them since we now require Java 8.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Andras Beni <andrasbeni@cloudera.com>, Manikumar Reddy O <manikumar.reddy@gmail.com>, Dong Lin <lindong28@gmail.com>
Closes#5049 from ijuma/remove-base64
* The consumer groups API should expose group state and coordinator information. This information is needed by administrative tools and scripts that access consume groups.
* The partition assignment will be empty when the group is rebalancing. Fix an issue where the adminclient attempted to deserialize this empty buffer.
* Remove nulls from the API and make all collections immutable.
* DescribeConsumerGroupsResult#all should return a result as expected, rather than Void
* Fix exception text for GroupIdNotFoundException, GroupNotEmptyException. It was being filled in as "The group id The group id does not exist was not found" and similar.
Reviewers: Attila Sasvari <asasvari@apache.org>, Andras Beni <andrasbeni@cloudera.com>, Dong Lin <lindong28@gmail.com>, Jason Gustafson <jason@confluent.io>
This patch adds a few metrics that are useful for monitoring controller health. See KIP-237 for more detail.
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#4392 from lindong28/KAFKA-3473
Removed usage of deprecated AdminClient from StreamsResetter
No additional tests are required.
Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Also include a few clean-ups:
* Method/variable/parameter renames to make them consistent with
the class name
* Return `ApiVersion` from `minSupportedFor`
* Use `values` to remove some code duplication
* Reduce duplication in `ApiVersion` by introducing the `shortVersion`
method and building the versions map programatically
* Avoid unnecessary `regex` in `ApiVersion.apply`
* Added scaladoc to a few methods
Some of these were originally discussed in:
https://github.com/apache/kafka/pull/4583#pullrequestreview-98089400
Added a test for `ApiVersion.shortVersion`. Relying on existing tests
for the rest since there is no change in behaviour.
Reviewers: Jason Gustafson <jason@confluent.io>
Implementation of KIP-279 as described here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-279%3A+Fix+log+divergence+between+leader+and+follower+after+fast+leader+fail+over
In summary:
- Added leader_epoch to OFFSET_FOR_LEADER_EPOCH_RESPONSE
- Leader replies with the pair( largest epoch less than or equal to the requested epoch, the end offset of this epoch)
- If Follower does not know about the leader epoch that leader replies with, it truncates to the end offset of largest leader epoch less than leader epoch that leader replied with, and sends another OffsetForLeaderEpoch request. That request contains the largest leader epoch less than leader epoch that leader replied with.
Reviewers: Dong Lin <lindong28@gmail.com>, Jun Rao <junrao@gmail.com>
Grow buffers in log cleaner to hold one message set after sanity check even if message set is bigger than max.message.bytes.
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Fixes a deadlock between the controller's beforeInitializingSession callback which holds the zookeeper client initialization lock while awaiting completion of an asynchronous event which itself depends on the same lock.
Also catch and log callback exceptions to ensure the ZooKeeper reconnection takes place.
Finally, configure KafkaScheduler in ZooKeeperClient to have at least 1 thread.
Added tests that fail or hang without the changes in this PR.
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
The leader must explicitly check if requested leader epoch is undefined, and return undefined offset so that the follower can fall back to truncating to high watermark. Otherwise, if the leader also is not tracking leader epochs, it may return its LEO, which will the follower to truncate to the incorrect offset.
Log cleaner grows buffers when result.messagesRead is zero. This contains the number of filtered messages read from source which can be zero when transactions are used because batches may be discarded. Log cleaner incorrectly assumes that messages were not read because the buffer was too small and attempts to double the buffer size unnecessarily, failing with an exception if the buffer is already max.message.bytes. Additional check for discarded batches has been added to avoid growing buffers when batches are discarded.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Enable dynamic update of default unclean leader election config of brokers. A new controller event has been added to process unclean leader election when the config is enabled dynamically.
Reviewers: Dong Lin <lindong28@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
Prevent exception thrown by metric reporters to impact request processing and other reporters.
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Added unit tests for ReplicaAlterLogDirsThread. Mostly focused on unit tests for truncating logic.
Fixed ReplicaAlterLogDirsThread.buildLeaderEpochRequest() to use future replica's latest epoch (not the latest epoch of replica it is fetching from). This follows the logic that offset for leader epoch request should be based on leader epoch of the follower (in this case it's the future local replica).
Also fixed PartitionFetchState constructor that takes offset and delay. The code ignored the delay parameter and used 0 for the delay. This constructor is used only by another constructor which passes delay = 0, which luckily works.
Author: Anna Povzner <anna@confluent.io>
Reviewers: Dong Lin <lindong28@gmail.com>
Closes#4918 from apovzner/kafka-6795
Currently if the client sends a produce request or a fetch request to a broker which isn't a replica,
we return UNKNOWN_TOPIC_OR_PARTITION. This is a bit surprising to see when the topic actually
exists. It would be better to return NOT_LEADER to avoid confusion. Clients typically handle both errors by refreshing metadata and retrying, so changing this should not cause any change in behavior on the client. This case can be hit following a partition reassignment after the leader is moved and the local replica is deleted.
To validate the current behavior and the fix, I've added integration tests for the fetch and produce APIs.
Start processing client connections only after completing KafkaServer initialization to ensure that credentials are loaded from ZK into cache before authentications are processed. Acceptors are started earlier so that bound port is known for registering in ZK.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
A partially deleted topic can end up with some partitions having no leadership info.
For the partially deleted topic, a new controller should be able to finish the topic deletion
by transitioning the rogue partition's replicas to OfflineReplica state.
This patch adds logic to transition replicas to OfflineReplica state whose partitions have
no leadership info.
Added a new test method to cover the partially deleted topic case.
Reviewers: Jun Rao <junrao@gmail.com>
Removed timeout from get call that caused the test to fail occasionally, this will instead fall back to the wrapping waitUntilTrue timeout. Also added unnesting of exceptions from ExecutionException that was originally missing and put the retrieved value for lowWatermark in the fail message for better readability in case of test failure.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
We had a regression in #4788 which caused the offset commit/fetch/describe APIs to fail if the groupId was empty. This should be allowed for backwards compatibility. Additionally, I have modified DeleteGroups to allow removal of the empty group, which was missed in the initial implementation. I've added a test case to ensure that we do not miss this again in the future.
Reviewers: Ismael Juma <ismael@juma.me.uk>
AsyncProducerTest gets an error about an incorrect mock when the logging
level is turned up. Instead of usIng a mock, just create a real
SyncProducerConfig object, since the object is simple to create.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Move creation of quota callback instance out of KafkaConfig constructor to QuotaFactory.instantiate to avoid creating a callback instance for every KafkaConfig since we create temporary KafkaConfigs during dynamic config updates.
Reviewers: Jun Rao <junrao@gmail.com>
Add a validation check to make sure max.connections.per.ip.overrides is configured when max.connections.per.ip is set zero. Also clean up the config description.
Implementation of KIP-86. Client, server and login callback handlers have been made configurable for both brokers and clients.
Reviewers: Jun Rao <junrao@gmail.com>, Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
In the group coordinator, we currently check whether the partition is owned before checking whether it is loading. Since loading is a prerequisite for partition ownership, it means that it is not actually possible to see the COORDINATOR_LOAD_IN_PROGRESS error. The impact is mostly harmless: while loading the group, the client may send unnecessary FindCoordinator requests to rediscover the coordinator. I've fixed the bug and restructured the code to enable testing.
In the process of fixing this bug, the following improvements have been made:
1. We now verify valid groupId in all request handlers.
2. Currently if the coordinator is loading when a SyncGroup is received, we'll return NOT_COORDINATOR. I've changed this to return REBALANCE_IN_PROGRESS since the rebalance state will have been lost on coordinator failover. This effectively forces the consumer to rejoin the group, which seems preferable over unnecessarily rediscovering the coordinator.
3. I added a check for the COORDINATOR_LOAD_IN_PROGRESS handler in SyncGroup. Although we do not currently return this error, it seems reasonable that we might want to some day, so it seems better to get the check in now.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Prior to this, there have been instances where invalid data was allowed to be persisted in
ZooKeeper, which causes ClassCastExceptions when a broker is restarted and reads this
type-unsafe data.
Adds basic structural and type validation for the reassignment JSON via
introduction of Scala case classes that map to the expected JSON
structure. Also use the Scala case classes to deserialize the JSON
to avoid duplication.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Viktor Somogyi <viktor.somogyi@cloudera.com>, Ismael Juma <ismael@juma.me.uk>