* Fix for KAFKA-7974: Avoid calling disconnect() when not connecting
* Resolve host only when currentAddress() is called
Moves away from automatically resolving the host when the connection entry is constructed, which can leave ClusterConnectionStates in a confused state.
Instead, resolution is done on demand, ensuring that the entry in the connection list is present even if the resolution failed.
* Add Javadoc to ClusterConnectionStates.connecting()
Refactors the various maps used in TransactionManager into one map to simplify bookkeeping of inflight batches, offsets and sequence numbers.
Reviewers: Jason Gustafson <jason@confluent.io>
* KAFKA-7962: StickyAssignor: throws NullPointerException during assignments if topic is deleted
https://issues.apache.org/jira/browse/KAFKA-7962
Consumer using StickyAssignor throws NullPointerException if a subscribed topic was removed.
* addressed vahidhashemian's comments
* lower NPath Complexity
* added a unit test
Whenever the consumer coordinator sends a response that doesn't match the client consumer subscription, we should check the subscription to see if it has changed. If it has, we can ignore the assignment and request a rebalance. Otherwise, we can throw an exception as before.
Testing strategy: create a mocked client that first sends an assignment response that doesn't match the client subscription followed by an assignment response that does match the client subscription.
Reviewers: Jason Gustafson <jason@confluent.io>
Currently, commitTransaction and abortTransaction wait indefinitely for the respective operation to be completed. This patch uses the producer's max block time to limit the time that we will wait. If the timeout elapses, we raise a TimeoutException, which allows the user to either close the producer or retry the operation.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Use of `MetadataRequest.isAllTopics` is not consistently defined for all versions of the api. For v0, it evaluates to false. This patch makes the behavior consistent for all versions.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Since we are logging offset resets and such at info level, it makes sense to use the same level for subscriptions and assignments.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Fail produce requests using zstd until the inter.broker.protocol.version is large enough that replicas are ensured to support it. Otherwise, followers receive the `UNSUPPORTED_COMPRESSION_TYPE` when fetching zstd data and ISRs shrink.
Reviewers: Jason Gustafson <jason@confluent.io>
Fix the following situations, where pending members (one that has a member-id, but hasn't joined the group) can cause rebalance operations to fail:
- In AbstractCoordinator, a pending consumer should be allowed to leave.
- A rebalance operation must successfully complete if a pending member either joins or times out.
- During a rebalance operation, a pending member must be able to leave a group.
Reviewers: Boyang Chen <bchen11@outlook.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
When debugging KafkaConsumer production issues, it's pretty
useful to have log entries related to seeking and committed
offset retrieval enabled by default. These are currently present,
but only when debug logging is enabled. Change them to `info`.
Also included a minor code simplication and a slight improvement
to an exception message.
Reviewers: Jason Gustafson <jason@confluent.io>
It used to preallocate an array of responses and then complete each response from the original collection sequentially. The problem was that the original collection could have been modified (another thread completing the response) while this was hapenning
Avoid unnecessary lock acquire when KafkaConsumer commits offsets.
Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
As part of KIP-320, we should add the expected leader epoch to Fetch and ListOffsets requests. This will allow us ultimately to detect log truncation.
Reviewers: Jason Gustafson <jason@confluent.io>
- Include more detail in the client log message if the disconnection happens
during authentication.
- Include exception message in the Selector info entry when authentication
fails and unwrap `DelayedResponseAuthenticationException`.
- Remove duplicate debug log on authentication failure in the Selector.
Empty username or password would result in the "expected 3 tokens"
error instead of "username not specified" or "password not specified".
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
JUnit 4.13 fixes the issue where `Category` and `Parameterized` annotations
could not be used together. It also deprecates `ExpectedException` and
`assertThat`. Given this, we:
- Replace `ExpectedException` with the newly introduced `assertThrows`.
- Replace `Assert.assertThat` with `MatcherAssert.assertThat`.
- Annotate `AbstractLogCleanerIntegrationTest` with `IntegrationTest` category.
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, David Arthur <mumrah@gmail.com>
Don't return error messages from `SaslException` to clients. Error messages to be returned to clients to aid debugging must be thrown as AuthenticationExceptions. This is a fix for a regression from KAFKA-7352.
Reviewers: Ron Dagostino <rndgstn@gmail.com>, Ismael Juma <ismael@juma.me.uk
This helps narrow down the specific broker they came from when debugging
ACL propagation issues.
Reviewers: Vahid Hashemian <vahid.hashemian@gmail.com>, Ismael Juma <ismael@juma.me.uk>
When an older message format is in use, we should disable the leader epoch cache so that we resort to truncation by high watermark. Previously we updated the cache for all versions when a broker became leader for a partition. This can cause large and unnecessary truncations after leader changes because we relied on the presence of _any_ cached epoch in order to tell whether to use the improved truncation logic possible with the OffsetsForLeaderEpoch API.
Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Jun Rao <junrao@gmail.com>
Replaces original loginContext if login fails in the refresh thread to ensure that the refresh thread is left in a clean state when there are exceptions while connecting to an OAuth server. Also makes client callback handler more robust by using the token with the longest remaining time for expiry instead of throwing an exception if multiple tokens are found.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
This patch introduces a new config - "group.max.size", which caps the maximum size any group can reach. It has a default value of Int.MAX_VALUE. Once a group is of the maximum size, subsequent JoinGroup requests receive a MAX_SIZE_REACHED error.
In the case where the config is changed and a Coordinator broker with the new config loads an old group that is over the threshold, members are kicked out of the group and a rebalance is forced.
Reviewers: Vahid Hashemian <vahid.hashemian@gmail.com>, Boyang Chen <bchen11@outlook.com>, Gwen Shapira <cshapi@gmail.com>, Jason Gustafson <jason@confluent.io>
See also KIP-183.
This implements the following algorithm:
AdminClient sends ElectPreferredLeadersRequest.
KafakApis receives ElectPreferredLeadersRequest and delegates to
ReplicaManager.electPreferredLeaders()
ReplicaManager delegates to KafkaController.electPreferredLeaders()
KafkaController adds a PreferredReplicaLeaderElection to the EventManager,
ReplicaManager.electPreferredLeaders()'s callback uses the
delayedElectPreferredReplicasPurgatory to wait for the results of the
election to appear in the metadata cache. If there are no results
because of errors, or because the preferred leaders are already leading
the partitions then a response is returned immediately.
In the EventManager work thread the preferred leader is elected as follows:
The EventManager runs PreferredReplicaLeaderElection.process()
process() calls KafkaController.onPreferredReplicaElectionWithResults()
KafkaController.onPreferredReplicaElectionWithResults()
calls the PartitionStateMachine.handleStateChangesWithResults() to
perform the election (asynchronously the PSM will send LeaderAndIsrRequest
to the new and old leaders and UpdateMetadataRequest to all brokers)
then invokes the callback.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jun Rao <junrao@gmail.com>
The problem is that the sequence number is an Int and should wrap around when it reaches the Int.MaxValue. The bug here is it doesn't wrap around and become negative and raises an error.
Reviewers: Jason Gustafson <jason@confluent.io>
This patch fixes a few overflow issues with wrapping sequence numbers in the broker's producer state tracking.
Reviewers: Jason Gustafson <jason@confluent.io>
The presence of the buildSrc subproject is causing problems when we try
to run installAll, jarAll, and the other "all" targets. It's easier
just to make the generator code a regular subproject and use the
JavaExec gradle task to run the code. This also makes it more
straightforward to run the generator unit tests.
Reviewers: David Arthur <mumrah@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Co-authored-by: Colin P. Mccabe <cmccabe@confluent.io>
Co-authored-by: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
* Join ssl principal mapping rules correctly before evaluating.
Java properties splits the configuration array on commas, and that leads to rules containing commas being split before being evaluated. This commit adds a code change to re-join those strings into full rules before evaluating them. The function assumes every rule is either DEFAULT or begins with the prefix RULE:
The Javadoc is using Properties.put which should never be used because it allows putting non-strings into a Properties object which is designed to only handle strings.
Two other minor fixes so the examples actually work
Track the last seen partition epoch in the Metadata class. When handling metadata updates, check that the partition info being received is for the last seen epoch or a newer one. This prevents stale metadata from being loaded into the client.
Reviewers: Jason Gustafson <jason@confluent.io>
This patch adds a framework to automatically generate the request/response classes for Kafka's protocol. The code will be updated to use the generated classes in follow-up patches. Below is a brief summary of the included components:
**buildSrc/src**
The message generator code is here. This code is automatically re-run by gradle when one of the schema files changes. The entire directory is processed at once to minimize the number of times we have to start a new JVM. We use Jackson to translate the JSON files into Java objects.
**clients/src/main/java/org/apache/kafka/common/protocol/Message.java**
This is the interface implemented by all automatically generated messages.
**clients/src/main/java/org/apache/kafka/common/protocol/MessageUtil.java**
Some utility functions used by the generated message code.
**clients/src/main/java/org/apache/kafka/common/protocol/Readable.java, Writable.java, ByteBufferAccessor.java**
The generated message code uses these classes for writing to a buffer.
**clients/src/main/message/README.md**
This README file explains how the JSON schemas work.
**clients/src/main/message/\*.json**
The JSON files in this directory implement every supported version of every Kafka API. The unit tests automatically validate that the generated schemas match the hand-written schemas in our code. Additionally, there are some things like request and response headers that have schemas here.
**clients/src/main/java/org/apache/kafka/common/utils/ImplicitLinkedHashSet.java**
I added an optimization here for empty sets. This is useful here because I want all messages to start with empty sets by default prior to being loaded with data. This is similar to the "empty list" optimizations in the `java.util.ArrayList` class.
Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Ismael Juma <ismael@juma.me.uk>, Bob Barrett <bob.barrett@outlook.com>, Jason Gustafson <jason@confluent.io>
* Update KafkaAdminClient#describeTopics to throw UnknownTopicOrPartitionException.
* Remove unused method: WorkerUtils#getMatchingTopicPartitions.
* Add some JavaDoc.
Reviewed-by: Colin P. McCabe <cmccabe@apache.org>, Ryanne Dolan <ryannedolan@gmail.com>