Reviewers: Colin P. McCabe <cmccabe@apache.org>, Viktor Somogyi <viktorsomogyi@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>
Currently close() only awaits completion of pending produce requests. If a transaction is ongoing, it may be dropped. For example, if one thread is calling commitTransaction() and another calls close(), then the commit may never happen even if the caller is willing to wait for it (by using a long timeout). What's more, the thread blocking in commitTransaction() will be stuck, since the result will never be completed once the producer has shut down.
This patch ensures that 1) completing transactions are awaited, 2) ongoing transactions are aborted, and 3) pending callbacks are completed before close() returns.
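A minimal sketch of the scenario this fixes, using only the public producer API; with this patch, a close() with a generous timeout waits for the commit issued on the other thread instead of dropping it:

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.kafka.clients.producer.KafkaProducer;

class CommitThenCloseSketch {
    static void commitThenClose(KafkaProducer<String, String> producer) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Thread 1: commit the ongoing transaction
        pool.submit(producer::commitTransaction);
        // Thread 2: close with a long timeout; the commit is now awaited
        // (and pending callbacks completed) before close() returns
        pool.submit(() -> producer.close(Duration.ofSeconds(30)));
        pool.shutdown();
    }
}
```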
Reviewers: Jason Gustafson <jason@confluent.io>
We have not had great experience with listeners. They make the code harder to understand because they result in indirectly maintained circular dependencies. Often this leads to tricky deadlocks when we try to introduce locking. We were able to remove the Metadata listener in KAFKA-7831. Here we do the same for the listener in SubscriptionState.
Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
This patch updates the InitProducerId request API to use the generated sources. It also fixes a small bug in the DescribeAclsRequest class where we were using the wrong api key.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Colin McCabe <cmccabe@apache.org>
Protocol compatibility is easier to preserve if a Struct that extends an older Struct by appending fields at the end can read a message of the older version, tolerating the absence of the new fields. Reading the missing fields must be allowed by their definitions (they have to be nullable) and supported by the schema.
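A minimal sketch of the idea using the internal Struct/Schema types; the tolerance flag on the Schema constructor and the nullable default are assumptions based on this change:

```java
import java.nio.ByteBuffer;
import org.apache.kafka.common.protocol.types.Field;
import org.apache.kafka.common.protocol.types.Schema;
import org.apache.kafka.common.protocol.types.Struct;
import org.apache.kafka.common.protocol.types.Type;

public class SchemaToleranceSketch {
    public static void main(String[] args) {
        Schema v0 = new Schema(new Field("id", Type.INT32, "entity id"));
        // v1 appends a nullable field with a null default; "true" asks the
        // schema to tolerate its absence when reading older messages
        // (flag and constructor shape are assumptions)
        Schema v1 = new Schema(true,
                new Field("id", Type.INT32, "entity id"),
                new Field("tag", Type.NULLABLE_STRING, "optional tag", null));

        Struct oldMessage = new Struct(v0).set("id", 42);
        ByteBuffer buffer = ByteBuffer.allocate(oldMessage.sizeOf());
        oldMessage.writeTo(buffer);
        buffer.flip();

        Struct decoded = v1.read(buffer);
        System.out.println(decoded.get("tag")); // null: the missing field is tolerated
    }
}
```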
Reviewers: David Arthur <mumrah@gmail.com>, Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
Ensure that the modification time is checked against the file that was used to create the in-use SSLContext, so that the SSLContext is updated whenever the file is modified and a config update request is received.
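A minimal sketch of the check, with illustrative names rather than the broker's actual classes:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;

class SslReloadSketch {
    private final Path keystorePath;
    private volatile Instant contextLoadedAt = Instant.EPOCH;

    SslReloadSketch(String keystore) {
        this.keystorePath = Paths.get(keystore);
    }

    // Compare against the file behind the SSLContext that is actually in use,
    // so a config update request after a file change triggers a rebuild
    boolean needsRebuild() throws IOException {
        Instant mtime = Files.getLastModifiedTime(keystorePath).toInstant();
        return mtime.isAfter(contextLoadedAt);
    }

    void markRebuilt() throws IOException {
        contextLoadedAt = Files.getLastModifiedTime(keystorePath).toInstant();
    }
}
```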
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
ToString functions must not throw a NullPointerException. read() functions must properly translate a negative array length to a null field.
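A minimal sketch of both conventions; the wire format shown (an int32 length, with -1 denoting null) is illustrative:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class NullableArraySketch {
    // read(): a negative serialized length means a null field, not an empty one
    static int[] readNullableInt32Array(ByteBuffer buffer) {
        int length = buffer.getInt();
        if (length < 0)
            return null;
        int[] values = new int[length];
        for (int i = 0; i < length; i++)
            values[i] = buffer.getInt();
        return values;
    }

    // toString(): must tolerate the null produced above
    static String describe(int[] values) {
        return "values=" + Arrays.toString(values); // prints "null" instead of throwing
    }
}
```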
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
The goal of this task is to implement an integration test for the Kafka Streams metrics.
We have to check two things:
1. After a Streams application is started, all metrics from the different levels (thread, task, processor, store, cache) are correctly created and report recorded values.
2. When a Streams application is shut down, all metrics are correctly de-registered and removed.
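A minimal sketch of the first check; the group names passed in (e.g. `stream-task-metrics`, `stream-record-cache-metrics`) are assumptions about the registered metric groups:

```java
import java.util.Map;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;

class StreamsMetricsCheckSketch {
    static void verifyMetricsRegistered(KafkaStreams streams, String... expectedGroups) {
        Map<MetricName, ? extends Metric> metrics = streams.metrics();
        for (String group : expectedGroups) {
            boolean present = metrics.keySet().stream()
                    .anyMatch(name -> name.group().equals(group));
            if (!present)
                throw new AssertionError("No metrics registered for group " + group);
        }
    }
}
```

The second check would assert that the same groups are gone from streams.metrics() after close().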
Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
TopicDescription and ConsumerGroupDescription in org.apache.kafka.clients.admin are part of the public API, so we should retain the existing public constructors. The new constructor that takes authorized operations is now package-private, to avoid maintaining more public constructors, since we only expect the admin client to use it.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Adds a new listener config `max.connections` to limit the number of active connections on each listener. The config may be prefixed with the listener prefix to set a listener-specific limit. This limit may be dynamically reconfigured without restarting the broker.
This is one of the PRs for KIP-402 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-402%3A+Improve+fairness+in+SocketServer+processors). Note that this is currently built on top of PR #6022
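As a usage illustration, a hedged sketch of a dynamic update through the admin client, assuming broker id 0; a listener-specific limit would use the prefixed key (e.g. `listener.name.internal.max.connections`):

```java
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

class MaxConnectionsUpdateSketch {
    static void lowerLimit() throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "0");
            Config config = new Config(Collections.singletonList(
                    new ConfigEntry("max.connections", "512")));
            // No broker restart required: the limit is dynamically reconfigurable
            admin.alterConfigs(Collections.singletonMap(broker, config)).all().get();
        }
    }
}
```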
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Gwen Shapira <cshapi@gmail.com>
Closes #6034 from rajinisivaram/KAFKA-7730-max-connections
Users have reported (KAFKA-7565) that when the consumer's poll wakeup is used, it is possible to receive fetch responses that don't match the topic partition collection copied for the session when the fetch request was created.
This commit improves the error handling here by throwing an IllegalStateException instead of a NullPointerException, and by generating an exception message that includes a bit more information.
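A minimal sketch of the improved check; `sessionPartitions` stands in for the session's copied topic partition collection:

```java
import java.util.Map;
import org.apache.kafka.common.TopicPartition;

class FetchResponseCheckSketch {
    static void checkPartition(Map<TopicPartition, ?> sessionPartitions, TopicPartition tp) {
        // Fail with a descriptive IllegalStateException instead of an NPE later
        if (!sessionPartitions.containsKey(tp))
            throw new IllegalStateException("Received fetch data for partition " + tp
                    + " which is not in the session partitions " + sessionPartitions.keySet());
    }
}
```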
Reviewers: Jason Gustafson <jason@confluent.io>
There is a small timing window where `time.sleep(retryBackoff)` will get executed before the AdminClient adds the retry request to the queue. Added a check to wait until the retry call is added to the queue in AdminClient.
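A minimal sketch of the test-side synchronization; the pending-call accessor is an assumption:

```java
import org.apache.kafka.common.utils.MockTime;
import org.apache.kafka.test.TestUtils;

class RetryQueueWaitSketch {
    interface PendingCallCounter {
        int numPendingCalls(); // assumed accessor into the admin client's call queue
    }

    static void sleepAfterRetryQueued(PendingCallCounter admin, MockTime time, long retryBackoffMs)
            throws InterruptedException {
        // Close the timing window: only advance the clock once the retry is queued
        TestUtils.waitForCondition(() -> admin.numPendingCalls() > 0,
                "The retry call was never added to the queue");
        time.sleep(retryBackoffMs);
    }
}
```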
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #6418 from omkreddy/KAFKA-7939
Metadata may be updated from the background thread, so we need to protect access to SubscriptionState. This patch restructures the metadata handling so that we only check pattern subscriptions in the foreground. Additionally, it improves the following:
1. SubscriptionState is now the source of truth for the topics that will be fetched. We previously had a lot of messy logic to try to keep the topic set in Metadata consistent with the subscription, so this simplifies the logic.
2. The metadata needs for the producer and consumer are quite different, so it made sense to separate the custom logic into separate extensions of Metadata. For example, only the producer requires topic expiration.
3. We've always had an edge case in which a metadata change with an inflight request may cause us to effectively miss an expected update. This patch implements a separate version inside Metadata which is bumped when the set of needed topics changes (see the sketch after this list).
4. This patch removes the MetadataListener, which was the cause of https://issues.apache.org/jira/browse/KAFKA-7764.
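A minimal sketch of the versioning idea from item 3; the names are illustrative, not Metadata's actual fields:

```java
class MetadataNeedsSketch {
    private int requestVersion = 0;
    private boolean updateRequested = false;

    // Called when the set of needed topics changes (e.g. a new topic is sent to)
    synchronized void onNeededTopicsChanged() {
        requestVersion++;
        updateRequested = true;
    }

    // Called when a metadata request is sent; remember the version it was built at
    synchronized int beginRequest() {
        updateRequested = false;
        return requestVersion;
    }

    // Called when the response arrives; if the needs changed while the request
    // was inflight, the response may not cover them, so request another update
    synchronized void onResponse(int versionAtRequest) {
        if (versionAtRequest < requestVersion)
            updateRequested = true;
    }

    synchronized boolean updateRequested() {
        return updateRequested;
    }
}
```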
Reviewers: David Arthur <mumrah@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
In KAFKA-5503, we added a check for the `running` flag in the loop inside maybeWaitForProducerId. This handles a concurrent call to Sender close() while we attempt to get the ProducerId, and avoids blocking indefinitely when the producer is shutting down.
This created a corner case in which the Sender thread gets blocked if a producerId reset races with a call to Sender close(). The fix here is to check the `forceClose` flag in the loop inside maybeWaitForProducerId instead of the `running` flag.
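A minimal sketch of the fixed loop's shape; the names approximate Sender's internals:

```java
class SenderShutdownSketch {
    private volatile boolean forceClose; // set on forced shutdown
    private boolean hasProducerId;

    void maybeWaitForProducerId() {
        // Checking forceClose (rather than running) means a forced shutdown that
        // races with a producer id reset still breaks out of this loop
        while (!forceClose && !hasProducerId) {
            hasProducerId = tryObtainProducerId();
        }
    }

    private boolean tryObtainProducerId() {
        return true; // placeholder for the InitProducerId round trip and backoff
    }
}
```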
Reviewers: Jason Gustafson <jason@confluent.io>
SelectorTest.testCloseConnectionInClosingState sends and receives messages to get the channel into a state with staged receives and then waits for the idle timeout to close the channel. When run with SSL, the channel may have buffered bytes that prevent it from being closed. Updated the test to also wait until there are no buffered bytes.
Reviewers: Ismael Juma <ismael@juma.me.uk>
The test uses 100 ms as connections.max.reauth.ms and checks that a second re-authentication doesn't occur within the hard-coded 1 second minimum interval. But since the interval is small, we cannot guarantee that the time between the two checks stays below 1 second. Change the test to use MockTime so that we can control the time.
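A minimal sketch of the approach; the assertions are placeholders for the test's actual checks:

```java
import org.apache.kafka.common.utils.MockTime;

class ReauthTimingSketch {
    public static void main(String[] args) {
        MockTime time = new MockTime();
        long firstReauthMs = time.milliseconds();
        time.sleep(999); // still inside the hard-coded 1 s minimum: assert no re-authentication
        time.sleep(2);   // now past the minimum: the second re-authentication may occur
        System.out.println(time.milliseconds() - firstReauthMs); // 1001, fully deterministic
    }
}
```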
Reviewers: Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
* Fix for KAFKA-7974: Avoid calling disconnect() when not connecting
* Resolve host only when currentAddress() is called
Moves away from automatically resolving the host when the connection entry is constructed, which can leave ClusterConnectionStates in a confused state.
Instead, resolution is done on demand, ensuring that the entry in the connection list is present even if the resolution failed (see the sketch after this list).
* Add Javadoc to ClusterConnectionStates.connecting()
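A minimal sketch of the on-demand resolution; the names are illustrative of ClusterConnectionStates' bookkeeping:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

class NodeConnectionStateSketch {
    private final String host;
    private List<InetAddress> addresses; // null until first use
    private int addressIndex;

    NodeConnectionStateSketch(String host) {
        this.host = host;
    }

    // Resolution happens here, not in the constructor: the connection entry
    // always exists, and a failed lookup surfaces where it can be handled
    InetAddress currentAddress() throws UnknownHostException {
        if (addresses == null) {
            addresses = Arrays.asList(InetAddress.getAllByName(host));
            addressIndex = 0;
        }
        return addresses.get(addressIndex);
    }
}
```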
Refactors the various maps used in TransactionManager into one map to simplify bookkeeping of inflight batches, offsets and sequence numbers.
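A minimal sketch of the consolidated shape; the entry fields are illustrative of the per-partition state that was previously spread across parallel maps:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.TopicPartition;

class TopicPartitionBookkeeperSketch {
    static class PartitionEntry {
        int nextSequence;
        long lastAckedOffset = -1L;
        final Deque<Object> inflightBatches = new ArrayDeque<>(); // ProducerBatch stand-in
    }

    private final Map<TopicPartition, PartitionEntry> partitions = new HashMap<>();

    // One lookup yields all per-partition state, instead of one map per concern
    PartitionEntry entryFor(TopicPartition tp) {
        return partitions.computeIfAbsent(tp, ignored -> new PartitionEntry());
    }
}
```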
Reviewers: Jason Gustafson <jason@confluent.io>
* KAFKA-7962: StickyAssignor throws NullPointerException during assignment if a topic is deleted
https://issues.apache.org/jira/browse/KAFKA-7962
A consumer using the StickyAssignor throws a NullPointerException if a subscribed topic was removed.
* addressed vahidhashemian's comments
* lowered NPath complexity
* added a unit test
Whenever the consumer coordinator sends an assignment that doesn't match the consumer's subscription, we should check whether the subscription has changed. If it has, we can ignore the assignment and request a rebalance. Otherwise, we can throw an exception as before.
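A minimal sketch of the decision; the parameters stand in for what the ConsumerCoordinator actually inspects:

```java
import java.util.Set;

class AssignmentCheckSketch {
    static void validate(Set<String> subscription, Set<String> assignedTopics,
                         boolean subscriptionChangedSinceJoin, Runnable requestRebalance) {
        if (!subscription.containsAll(assignedTopics)) {
            if (subscriptionChangedSinceJoin)
                requestRebalance.run(); // stale assignment: ignore it and rejoin
            else
                throw new IllegalStateException("Assigned topics " + assignedTopics
                        + " do not match the current subscription " + subscription);
        }
    }
}
```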
Testing strategy: create a mocked client that first sends an assignment response that doesn't match the client subscription followed by an assignment response that does match the client subscription.
Reviewers: Jason Gustafson <jason@confluent.io>
Currently, commitTransaction and abortTransaction wait indefinitely for the respective operation to be completed. This patch uses the producer's max block time to limit the time that we will wait. If the timeout elapses, we raise a TimeoutException, which allows the user to either close the producer or retry the operation.
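A minimal sketch of what this enables for callers, using only the public API; it assumes max.block.ms has been configured on the producer:

```java
import java.time.Duration;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.errors.TimeoutException;

class CommitTimeoutSketch {
    static void commitOrClose(KafkaProducer<String, String> producer) {
        try {
            producer.commitTransaction(); // now bounded by max.block.ms
        } catch (TimeoutException e) {
            // The commit may still complete; the caller can retry the commit
            // or give up and close the producer
            producer.close(Duration.ofSeconds(5));
        }
    }
}
```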
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
Use of `MetadataRequest.isAllTopics` is not consistently defined for all versions of the API. For v0, it evaluates to false. This patch makes the behavior consistent across all versions.
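A minimal sketch of the unified semantics; the v0 handling (an empty topic list meaning "all topics") reflects the old wire format, and the code is illustrative rather than a quote of the patch:

```java
import java.util.List;

class AllTopicsSketch {
    static boolean isAllTopics(List<String> topics, short version) {
        if (version == 0)
            return topics == null || topics.isEmpty(); // v0: empty list means all topics
        return topics == null;                         // v1+: null means all topics
    }
}
```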
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Since we are logging offset resets and such at info level, it makes sense to use the same level for subscriptions and assignments.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Fail produce requests using zstd until the inter.broker.protocol.version is high enough to ensure that all replicas support it. Otherwise, followers receive the `UNSUPPORTED_COMPRESSION_TYPE` error when fetching zstd-compressed data and ISRs shrink.
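A minimal sketch of the gate (the real check lives in the broker's validation path; the boolean parameter stands in for the inter.broker.protocol.version comparison):

```java
class ZstdGateSketch {
    static void validateProduceCompression(String compressionType, boolean ibpSupportsZstd) {
        // Until the IBP guarantees every replica can decode zstd, reject the
        // produce request instead of letting followers fail their fetches
        if ("zstd".equals(compressionType) && !ibpSupportsZstd)
            throw new org.apache.kafka.common.errors.UnsupportedCompressionTypeException(
                    "Produce requests with zstd compression are not allowed until "
                            + "inter.broker.protocol.version guarantees replica support");
    }
}
```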
Reviewers: Jason Gustafson <jason@confluent.io>
Fix the following situations, in which pending members (ones that have a member id but haven't joined the group) can cause rebalance operations to fail:
- In AbstractCoordinator, a pending consumer should be allowed to leave.
- A rebalance operation must successfully complete if a pending member either joins or times out.
- During a rebalance operation, a pending member must be able to leave a group.
Reviewers: Boyang Chen <bchen11@outlook.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
When debugging KafkaConsumer production issues, it's pretty
useful to have log entries related to seeking and committed
offset retrieval enabled by default. These are currently present,
but only when debug logging is enabled. Change them to `info`.
Also included a minor code simplification and a slight improvement to an exception message.
Reviewers: Jason Gustafson <jason@confluent.io>
It used to preallocate an array of responses and then complete each response from the original collection sequentially. The problem was that the original collection could have been modified (by another thread completing a response) while this was happening.
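A minimal sketch of a race-free completion pattern; the collection and callback types are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

class ResponseCompletionSketch {
    private final List<Runnable> responses = new ArrayList<>();

    // Drain a snapshot under the lock, then complete outside of it, so a
    // concurrent completion cannot invalidate the elements we iterate over
    void completeAll() {
        List<Runnable> snapshot;
        synchronized (responses) {
            snapshot = new ArrayList<>(responses);
            responses.clear();
        }
        for (Runnable response : snapshot)
            response.run();
    }
}
```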