Do not allow an empty replica set to be passed into the reassignment API.
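For illustration, a client-side guard over the standard reassignment call might look like the sketch below (topic, partition, and broker ids are made up; the actual validation added here is on the server/API side):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class ReassignmentGuardExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            List<Integer> targetReplicas = Arrays.asList(1, 2, 3);
            // Mirrors the new validation: an empty target replica set is rejected
            // instead of being passed through to the reassignment API.
            if (targetReplicas.isEmpty()) {
                throw new IllegalArgumentException("Target replica set must not be empty");
            }
            Map<TopicPartition, Optional<NewPartitionReassignment>> reassignments = new HashMap<>();
            reassignments.put(new TopicPartition("my-topic", 0),
                    Optional.of(new NewPartitionReassignment(targetReplicas)));
            admin.alterPartitionReassignments(reassignments).all().get();
        }
    }
}
```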
Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@gmail.com>
For scenarios where the incoming traffic of all input partitions is small, there's a pitfall: the enforced processing timer is not reset after we have enforce-processed ALL records. The fix itself is pretty simple: we just reset the timer when there are no buffered records.
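As a rough illustration of the intended logic (names and structure here are hypothetical and simplified, not the actual Streams task code):

```java
// Simplified sketch with hypothetical names; the real logic lives inside the
// Streams task/partition-group code and differs in detail.
class EnforcedProcessingTimer {
    private final long maxTaskIdleMs;
    private long idleStartTimeMs = -1L; // -1 means the timer is not running

    EnforcedProcessingTimer(long maxTaskIdleMs) {
        this.maxTaskIdleMs = maxTaskIdleMs;
    }

    boolean readyToProcess(boolean allPartitionsBuffered, int totalBufferedRecords, long nowMs) {
        if (allPartitionsBuffered) {
            idleStartTimeMs = -1L;          // normal case: all inputs have data
            return true;
        }
        if (totalBufferedRecords == 0) {
            idleStartTimeMs = -1L;          // the fix: nothing left buffered, reset the timer
            return false;
        }
        if (idleStartTimeMs == -1L) {
            idleStartTimeMs = nowMs;        // start the enforced-processing timer
        }
        return nowMs - idleStartTimeMs >= maxTaskIdleMs; // enforce processing after the idle bound
    }
}
```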
Reviewers: Javier Holguera <javier.holguera@gmail.com>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bill@confluent.io>
After many runs of reproducing the failure (on my local MP5 it takes about 100 - 200 runs to get one) I think it is more likely a flaky test and not exposing a real bug in the rebalance protocol.
What I've observed is that, when the verifying consumer is trying to fetch from the output topics (there are 11 of them), it calls poll(1 sec) each time and retries up to 30 times before stopping if there's no more data to fetch. That means if no data is fetched from the output topics for 30 * 1 = 30 seconds, the verification stops (potentially too early). For the failure cases, we observe continual rebalancing among the closing / newly created clients because closing is async, i.e. while new clients are added it is possible that rebalances triggered by closing clients have not completed yet (note that each instance is configured with 3 threads, and in the worst case there are 6 instances running / pending shutdown at the same time, so a group of 3 * 6 = 18 members is possible).
However, there's still a possible bug in KIP-429 where the rebalance somehow never stabilizes and members keep re-joining, causing the test to fail. We have another unit test by @ableegoldman that has been bumped up to 3 rebalances, and if that fails again it would be better confirmation that such a bug may exist.
So what I've done so far for this test:
1. When closing a client, wait for it to complete closure before moving on to the next iteration and starting a new instance to reduce the rebalance churns.
2. Poll for 5 seconds instead of 1 to wait longer: 5 * 30 = 150 seconds in total, while locally my laptop finishes this test in about 50 seconds.
3. Minor debug logging improvements; in fact some of them reduce redundant debug logging since it is too verbose and sometimes hides the key information.
Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>
* Made commit-over-tasks sensor and skipped-records sensor optional since they are removed in the latest version
* Refactored methods for sensor creation
* Adapted unit and integration tests
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Aims to fix the flaky LogCleanerIntegrationTest#testIsThreadFailed by changing how metrics are cleaned.
Reviewers: Jason Gustafson <jason@confluent.io>
Allow routing of `AdminClient#describeTopics` to any broker in the cluster rather than just the controller, so that we don't create a hotspot for this API call. `AdminClient#describeTopics` uses the broker's metadata cache which is asynchronously maintained, so routing to brokers other than the controller is not expected to make a significant difference in terms of metadata consistency; all metadata requests are eventually consistent.
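Nothing changes for callers; a typical describeTopics call (topic name and bootstrap address made up for illustration) still looks like:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class DescribeTopicsExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // With this change the request may be served by any broker's
            // metadata cache, not necessarily the controller.
            Map<String, TopicDescription> topics =
                    admin.describeTopics(Collections.singleton("my-topic")).all().get();
            topics.forEach((name, description) ->
                    System.out.println(name + " -> " + description.partitions().size() + " partitions"));
        }
    }
}
```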
This patch also fixes a few flaky test failures.
Reviewers: Ismael Juma <ismael@juma.me.uk>, José Armando García Sancio <jsancio@gmail.com>, Jason Gustafson <jason@confluent.io>
KIP-352 aims to add several new metrics in order to track reassignments much better. We will be able to measure bytes in/out rate and the count of partitions under active reassignment.
We also change the semantics of the UnderReplicatedPartitions metric to cater better for reassignment. Currently it reports partitions as under-replicated when extra replicas are added during reassignment as part of the process; this PR changes it so the original replica set is always taken into account when computing the URP metric.
The newly added metrics will be:
- kafka.server:type=ReplicaManager,name=ReassigningPartitions
- kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesOutPerSec
- kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesInPerSec
The changed URP metric:
- kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Jason Gustafson <jason@confluent.io>
It includes an important fix for people running on k8s:
* ZOOKEEPER-3320: Leader election port stop listen when hostname unresolvable for some time
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
With KIP-392, we allow consumers to fetch from followers. This capability is enabled when a replica selector has been provided in the configuration. When not in use, the intent is to preserve current behavior of fetching only from leader. The leader epoch is the mechanism that keeps us honest. When there is a leader change, the epoch gets bumped, consumer fetches fail due to the fenced epoch, and we find the new leader.
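For reference, follower fetching is opted into via the broker's replica selector config plus the consumer's `client.rack`; a minimal consumer-side sketch (broker address and rack values are made up):

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FollowerFetchingConsumerExample {
    public static void main(String[] args) {
        // Broker side (server.properties), to opt in to follower fetching:
        //   replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // The consumer advertises its rack so the broker's replica selector can
        // route fetches to a nearby follower.
        props.put(ConsumerConfig.CLIENT_RACK_CONFIG, "rack-a");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // subscribe and poll as usual
        }
    }
}
```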
However, for old consumers, there is no similar protection. The leader epoch was not available to clients until recently. If there is a preferred leader election (for example), the old consumer will happily continue fetching from the demoted leader until a periodic metadata fetch causes us to discover the new leader. This does not create any problems from a correctness perspective (fetches are still bound by the high watermark), but it is unexpected and may cause surprising performance characteristics.
This patch fixes this problem by enforcing leader-only fetching for older versions of the fetch request.
Reviewers: Jason Gustafson <jason@confluent.io>
AbstractRequestResponse should be an interface, since it has no concrete elements or implementation. Move AbstractRequestResponse#serialize to RequestUtils#serialize and make it package-private, since it doesn't need to be public.
Reviewers: Ismael Juma <ismael@juma.me.uk>
The KafkaConsumer Fetcher can sometimes get into an invalid state where it believes there are ongoing fetch requests when in fact there are none. This can happen when the heartbeat thread concurrently handles a disconnection event just after the fetcher thread submits a request, leaving the Fetcher believing it still has ongoing requests to the disconnected node. The root cause is a thread safety issue in the Fetcher: the ordering of the modifications to nodesWithPendingFetchRequests could be incorrect, because the Fetcher added the node after the listener had already been invoked, which meant the pending node never got removed again.
This PR addresses the thread safety issue by adding the pending node to nodesWithPendingFetchRequests before the listener is attached to the future, so the listener's cleanup always runs after the node has been added.
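A minimal, self-contained sketch of the ordering guarantee (names are illustrative; the real Fetcher uses its own request-future machinery rather than CompletableFuture):

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Simplified sketch of the ordering fix; names do not match the real Fetcher internals.
public class PendingFetchTracker {
    private final Set<Integer> nodesWithPendingFetchRequests = ConcurrentHashMap.newKeySet();

    public void sendFetch(int nodeId, CompletableFuture<byte[]> responseFuture) {
        // The fix: mark the node as pending BEFORE attaching the completion callback,
        // so a callback that fires immediately (e.g. on disconnect) still finds the
        // node in the set and removes it.
        nodesWithPendingFetchRequests.add(nodeId);
        responseFuture.whenComplete((response, error) ->
                nodesWithPendingFetchRequests.remove(nodeId));
    }

    public boolean hasPendingFetch(int nodeId) {
        return nodesWithPendingFetchRequests.contains(nodeId);
    }
}
```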
Reviewers: Tom Lee, Jason Gustafson <jason@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
This PR fixes a race condition bug between the "consumer thread" and the "consumer coordinator heartbeat thread". It reproduces in many production environments.
Conditions for reproducing:
1. The consumer thread initiates a rejoin to the group because of a commit timeout, calling AbstractCoordinator#joinGroupIfNeeded, which leads to sendJoinGroupRequest.
2. JoinGroupResponseHandler writes the new generation data to AbstractCoordinator.this.generation and leaves the synchronized section.
3. The heartbeat thread executes maybeLeaveGroup and clears the generation data via resetGenerationOnLeaveGroup.
4. The consumer thread executes onJoinComplete(generation.generationId, generation.memberId, generation.protocol, memberAssignment); with the cleared generation data, which leads to the corresponding exception.
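One way to guard against this interleaving, shown here only as an illustrative sketch with hypothetical names and not necessarily the exact approach taken in this patch, is to snapshot the generation under the coordinator lock and abort the join if it was reset concurrently:

```java
// Illustrative sketch only: take a consistent snapshot of the generation under the
// coordinator lock instead of reading the mutable fields after leaving the synchronized section.
final class GenerationSnapshotExample {
    private final Object coordinatorLock = new Object();
    private Generation generation = Generation.NO_GENERATION;

    void completeJoin(byte[] memberAssignment) {
        final Generation snapshot;
        synchronized (coordinatorLock) {
            snapshot = generation;
        }
        if (snapshot == Generation.NO_GENERATION) {
            // The heartbeat thread reset the generation (e.g. via leave group) while we were joining.
            throw new IllegalStateException("Generation was reset during rejoin; rejoin needed");
        }
        onJoinComplete(snapshot.generationId, snapshot.memberId, snapshot.protocol, memberAssignment);
    }

    private void onJoinComplete(int generationId, String memberId, String protocol, byte[] assignment) { }

    static final class Generation {
        static final Generation NO_GENERATION = new Generation(-1, "", null);
        final int generationId; final String memberId; final String protocol;
        Generation(int generationId, String memberId, String protocol) {
            this.generationId = generationId; this.memberId = memberId; this.protocol = protocol;
        }
    }
}
```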
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Migrate this integration test to use TopologyTestDriver instead of running 3 Streams instances.
Dropped one test that was attempting to produce specific interleavings. If anything, these should be verified deterministically by unit testing.
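For reference, the general shape of a TopologyTestDriver-based test looks like the sketch below (the topology here is a made-up stand-in, not the one under test):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class TopologyTestDriverExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(v -> v.toUpperCase())
               .to("output", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "test-driver-example");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // never contacted

        try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
            TestInputTopic<String, String> in =
                    driver.createInputTopic("input", new StringSerializer(), new StringSerializer());
            TestOutputTopic<String, String> out =
                    driver.createOutputTopic("output", new StringDeserializer(), new StringDeserializer());
            in.pipeInput("key", "hello");
            System.out.println(out.readValue()); // HELLO
        }
    }
}
```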
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Also added some additional logging that makes sense to have and proved helpful in debugging this particular issue.
Unit tests verifying the encoded supported version were added.
This should get cherry-picked back to 2.1
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Fix the formatting and wording of the foreign-key join javadoc
Optimize handling of null extracted foreign keys
Reviewers: Sophie Blee-Goldman <sophie@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Added plugin isolation unit tests for various scenarios, with a `TestPlugins` class that compiles and builds multiple test plugins without them being on the classpath and verifies that the Plugins and DelegatingClassLoader behave properly. These initially failed for several cases, but now pass since the issues have been fixed.
KAFKA-8340 and KAFKA-8819 are closely related, and this fix corrects the problems reported in both issues.
Author: Greg Harris <gregh@confluent.io>
Reviewers: Chris Egerton <chrise@confluent.io>, Magesh Nandakumar <mageshn@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
Document the upgrade path for the consumer and for Streams (note that they differ significantly).
Needs to be cherry-picked to 2.4
Reviewers: Guozhang Wang <wangguoz@gmail.com>
In a KTable context, null record values have a special "tombstone" significance. We should always bypass the serdes for such tombstones, since otherwise the serde could violate Streams' table semantics.
Added test coverage for this case and fixed the code accordingly.
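Conceptually, the handling looks like the following sketch (illustrative only, not the actual Streams code):

```java
import org.apache.kafka.common.serialization.Serde;

public final class TombstoneSafeSerialization {
    // Illustrative sketch: a null value is a tombstone and must bypass the serde
    // entirely, so a misbehaving serde cannot violate table semantics.
    public static <V> byte[] serializeValue(Serde<V> serde, String topic, V value) {
        if (value == null) {
            return null; // tombstone: never hand null to the serde
        }
        return serde.serializer().serialize(topic, value);
    }
}
```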
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bill@confluent.io>
Fix bug in Connect REST extension API caused by invalid constructor parameter validation, and update integration test to play nicely with Jenkins
Fix instantiation of TaskState objects by Connect framework.
Author: Chris Egerton <chrise@confluent.io>
Reviewers: Magesh Nandakumar <mageshn@confluent.io>, Randall Hauch <rhauch@gmail.com>
By default, if the user does not configure a `client.id`, then we use a very generic identifier, such as `consumer-15`. It is more useful to include identifying information when available such as `group.id` for the consumer and `transactional.id` for the producer.
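For example, with a consumer configured like the sketch below (group name and bootstrap address made up), the generated default `client.id` should now embed the group id rather than being a bare counter-based name:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DefaultClientIdExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "payments");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // No client.id is set, so the generated default now includes the group id,
        // e.g. something like "consumer-payments-1" instead of a bare "consumer-1"
        // (the exact counter suffix depends on how many clients the JVM has created).
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // nothing to do; the point is only the generated client.id above
        }
    }
}
```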
Reviewers: Guozhang Wang <wangguoz@gmail.com>
I looked into the logs of the above tickets, and I think for a couple of them the failures are due to the fact that the threads take time to restore, or simply to stabilize the rebalance since there are multiple threads. This adds a hook to wait for the state to transition to RUNNING upon starting.
Reviewers: Chris Pettitt <cpettitt@confluent.io>, Matthias J. Sax <matthias@confluent.io>
A partition log is initialized in the following steps:
1. Fetch log config from ZK
2. Call LogManager.getOrCreateLog which creates the Log object, then
3. Registers the Log object
Step #3 enables the configuration update thread to deliver configuration
updates to the log. But if any update arrives between steps #1 and #3,
that update is missed. This breaks the following use case:
1. Create a topic with default configuration, and immediately after that
2. Update the configuration of the topic
There is a race condition here, and in random cases the update made in
the second step gets dropped.
This change fixes it by tracking updates arriving between steps #1 and #3.
Once a Partition is done initializing the log, it checks whether it has missed
any update. If yes, the configuration is read from ZK again.
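A simplified sketch of the approach (hypothetical names, not the actual broker code):

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Illustrative sketch: remember whether a config update arrived while the log
// was still being created, and re-read the config from ZK once registration is done.
public class LogConfigInitExample {
    private final AtomicBoolean configDirty = new AtomicBoolean(false);
    private volatile Properties logConfig;

    // Called by the config-update thread at any time.
    public void onConfigUpdate(Properties updated) {
        configDirty.set(true);
        // If the log is already registered, the update is applied directly (omitted here).
    }

    // Called by the thread initializing the partition log.
    public void initializeLog(Supplier<Properties> fetchConfigFromZk) {
        logConfig = fetchConfigFromZk.get();      // step 1: fetch log config from ZK
        createAndRegisterLog();                   // steps 2-3: create and register the Log
        if (configDirty.getAndSet(false)) {
            // An update raced with steps 1-3: refresh from ZK so it is not lost.
            logConfig = fetchConfigFromZk.get();
        }
    }

    private void createAndRegisterLog() { }
}
```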
Added unit tests to make sure a dirty configuration is refreshed. Tested
on local cluster to make sure that topic configuration and updates are
handled correctly.
Reviewers: Jason Gustafson <jason@confluent.io>
Author: Bruno Cadonna <bruno@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #7490 from cadonna/AK8942-docs-rocksdb_metrics
Minor comments