src-kafka

Commit Graph

Author	SHA1	Message	Date
Boyang Chen	03d61ebfb9	KAFKA-8569: integrate warning message under static membership (#6972 ) Static members never leave the group, so potentially we could log a flooding number of warning messages in the hb thread. The solution is to only log as warning when we are on dynamic membership. Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
wenhoujx	93bf965894	KAFKA-8559: Allocate ArrayList with correct size in PartitionStates (#6964 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Boyang Chen	c7db82b59a	MINOR: rename subscription construction function (#6954 ) Per discussion on #6936, some nit fixes to the Subscription initialization path. Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
Boyang Chen	1ae92914e2	HOTFIX: Fix optional import in ConsumerCoordinator (#6953 ) This was caused by back-to-back merging of #6854 (which removed the Optional import) and #6936 (which needed the import). Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
Boyang Chen	47f908fa73	KAFKA-8539; Add group.instance.id to Subscription (#6936 ) This PR is part of KIP-345's effort to utilize this new field for more stable topic partition assignment. We add the group instance id to the `Subscription` object to allow partition assignors to make stickier assignments. More details [here](https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances#KIP-345:Introducestaticmembershipprotocoltoreduceconsumerrebalances-ClientBehaviorChanges). Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
Boyang Chen	1b9e107388	KAFKA-7853: Refactor coordinator config (#6854 ) An attempt to refactor current coordinator logic. Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Konstantine Karantasis <konstantine@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	5 years ago
Colin P. Mccabe	e047864f30	MINOR: fix some warnings in the broker Author: Colin P. Mccabe <cmccabe@confluent.io> Reviewers: Gwen Shapira Closes #6942 from cmccabe/fix-scala-warnings	6 years ago
Guozhang Wang	2ef02f111e	KAFKA-8179: Part I, Bump up consumer protocol to v2 (#6528 ) 1. Add new fields of subscription / assignment and bump up consumer protocol to v2. 2. Update tests to make sure old versioned protocol can be successfully deserialized, and new versioned protocol can be deserialized by old byte code. Reviewers: Boyang Chen <boyang@confluent.io>, Sophie Blee-Goldman <sophie@confluent.io>, Bill Bejeck <bbejeck@gmail.com>	6 years ago
wenhoujx	35814298e1	KAFKA-8488: Reduce logging-related string allocation in FetchSessionHandler Reviewers: Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>	6 years ago
Jason Gustafson	8dd4fb5ebe	KAFKA-8530; Check for topic authorization errors in OffsetFetch response (#6928 ) The OffsetFetch requires Topic Describe permission. If a client does not have this, we return TOPIC_AUTHORIZATION_FAILED at the partition level. Currently the consumer does not handle this error explicitly, but raises it as a generic `KafkaException`. For consistency with other APIs and to fix transient test failures in `PlaintextEndToEndAuthorizationTest`, we should raise `TopicAuthorizationFailedException` instead. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Jason Gustafson	af2801031c	KAFKA-8483/KAFKA-8484; Ensure safe handling of producerId resets (#6883 ) The idempotent producer attempts to detect spurious UNKNOWN_PRODUCER_ID errors and handle them by reassigning sequence numbers to the inflight batches. The inflight batches are tracked in a PriorityQueue. The problem is that the reassignment of sequence numbers depends on the iteration order of PriorityQueue, which does not guarantee any ordering. So this can result in sequence numbers being assigned in the wrong order. This patch fixes the problem by using a sorted set instead of a priority queue so that the iteration order preserves the sequence order. Note that resetting sequence numbers is an exceptional case. This patch also fixes KAFKA-8484, which can cause an IllegalStateException when the producerId is reset while there are pending produce requests inflight. The solution is to ensure that sequence numbers are only reset if the producerId of a failed batch corresponds to the current producerId. Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
highluck	e2c15e0eeb	MINOR: Remove uncommitted code (#6919 )	6 years ago
Guozhang Wang	bebcbe3a04	KAFKA-8487: Only request re-join on REBALANCE_IN_PROGRESS in CommitOffsetResponse (#6894 ) Plus some minor cleanups on AbstractCoordinator. Reviewers: Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	cca05cace4	KAFKA-8331: stream static membership system test (#6877 ) As the title suggests, we boot a stream job with 3 stream instances and a one-minute session timeout and, once the group is stable, do a couple of rolling bounces of the entire cluster. Every rejoin caused by a restart should produce no generation bump on the client side. Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>	6 years ago
Almog Gavra	8e161580b8	KAFKA-8305; Support default partitions & replication factor in AdminClient#createTopic (KIP-464) (#6728 ) This commit makes three changes: - Adds a constructor for NewTopic(String, Optional<Integer>, Optional<Short>) which allows users to specify Optional.empty() for numPartitions or replicationFactor in order to use the broker default. - Changes AdminManager to accept -1 as valid options for replication factor and numPartitions (resolving to broker defaults). - Makes --partitions and --replication-factor optional arguments when creating topics using kafka-topics.sh. - Adds a dependency on scalaJava8Compat library to make it simpler to convert Scala Option to Java Optional Reviewers: Ismael Juma <ismael@juma.me.uk>, Ryanne Dolan <ryannedolan@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
David Arthur	264d1d8a8b	Improve logging in the consumer for epoch updates (#6879 )	6 years ago
Boyang Chen	055c9c7bd6	KAFKA-8311: better handle timeout exception on Stream thread (#6662 ) The goals for this small diff are: 1. Give users guidance if they want to relax the commit timeout threshold 2. Indicate the code path where the timeout exception was caught Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>	6 years ago
Guozhang Wang	573152dfa8	HOTFIX: Allow multi-batches for old format and no compression (#6871 ) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Randall Hauch	ce008e72de	KAFKA-8475: Temporarily restore SslFactory.sslContext() helper Temporarily restore the SslFactory.sslContext() function, which some connectors use. This function is not a public API and it will be removed eventually. For now, we will mark it as deprecated.	6 years ago
tadsul	b042b36674	KAFKA-8426; Fix for keeping the ConfigProvider configs consistent with KIP-297 (#6750 ) According to KIP-297 a parameter is passed to ConfigProvider with syntax "config.providers.{name}.param.{param-name}". Currently AbstractConfig allows parameters of the format "config.providers.{name}.{param-name}". With this fix AbstractConfig will be consistent with KIP-297 syntax. Reviewers: Robert Yokota <rayokota@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
tadsul	2c810e4afb	KAFKA-8425: Fix for correctly handling immutable maps (KIP-421 bug) (#6795 ) Since the originals map passed to AbstractConfig constructor may be immutable, avoid updating this map while resolving indirect config variables. Instead a new ResolvingMap instance is now used to store resolved configs. Reviewers: Randall Hauch <rhauch@gmail.com>, Boyang Chen <bchen11@outlook.com>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
Lifei Chen	5795675599	MINOR:Replace duplicated code with common function in utils (#6819 ) Reviewers: Ivan Yurchenko <ivanyu@aiven.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Jason Gustafson	fd9a20e416	KAFKA-8429; Handle offset change when OffsetForLeaderEpoch inflight (#6811 ) It is possible for the offset of a partition to be changed while we are in the middle of validation. If the OffsetForLeaderEpoch request is in-flight and the offset changes, we need to redo the validation after it returns. We had a check for this situation previously, but it was only checking if the current leader epoch had changed. This patch fixes this and moves the validation in `SubscriptionState` where it can be protected with a lock. Additionally, this patch adds test cases for the SubscriptionState validation API. We fix a small bug handling broker downgrades. Basically we should skip validation if the latest metadata does not include leader epoch information. Reviewers: David Arthur <mumrah@gmail.com>	6 years ago
Viktor Somogyi	e82e2e723a	KAFKA-7703; position() may return a wrong offset after seekToEnd (#6407 ) When poll is called and resets the offsets to the beginning, followed by a seekToEnd and a position call, the "reset to earliest" call in poll can override the "reset to latest" initiated by seekToEnd in a subtle way: 1. Both requests have been issued and have returned to the client side (listOffsetResponse has happened). 2. In Fetcher.resetOffsetIfNeeded(TopicPartition, Long, OffsetData) the thread scheduler could prefer the heartbeat thread with the "reset to earliest" call, overriding the offset to the earliest and setting the SubscriptionState with that position. 3. The thread scheduler continues execution of the application thread with the "reset to latest" call and discards it because the "reset to earliest" call has already set the position, which is the wrong one. 4. The blocking position call returns the earliest offset instead of the latest, which is not what the caller expects. The fix makes SubscriptionState synchronized so that we can verify that the reset is expected while holding the lock. Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
José Armando García Sancio	121308cc7a	KAFKA-8286; Generalized Leader Election Admin RPC (KIP-460) (#6686 ) Implements KIP-460: https://cwiki.apache.org/confluence/display/KAFKA/KIP-460%3A+Admin+Leader+Election+RPC. Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	051379ea5d	KAFKA-8430: unit test to make sure null `group.id` and valid `group.instance.id` are valid combo (#6830 ) As title suggests, this unit test is just a double check. No need to push in 2.3 Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>	6 years ago
Boyang Chen	901eb36883	MINOR: Set default `group.instance.id` in JoinGroupResponse to null (#6831 ) As we are planning to add on more supporting features for rebalancing under static membership, we need to make sure the behavior for `group.instance.id` is consistent throughout the whole stack. This patch ensures that the default value is null in the JoinGroup response. Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Colin Patrick McCabe	24f664aa16	MINOR: Auth operations must be null when talking to a pre-KIP-430 broker (#6812 ) Authorized operations must be null when talking to a pre-KIP-430 broker. If we present this as the empty set instead, it is impossible for clients to know if they have no permissions, or are talking to an old broker. Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	6 years ago
Jason Gustafson	e6057e5038	KAFKA-8437; Await node api versions before checking if offset validation is possible (#6823 ) The consumer should await api version information before determining whether the broker supports offset validation. In KAFKA-8422, we skip the validation if we don't have api version information, which means we always skip validation the first time we connect to a node. This bug was detected by the failing system test `tests/client/truncation_test.py`. The test passes again with this fix. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Jason Gustafson	a1808962e5	KAFKA-8422; Client should send OffsetForLeaderEpoch only if broker supports latest version (#6806 ) In the olden days, OffsetForLeaderEpoch was exclusively an inter-broker protocol and required Cluster level permission. With KIP-320, clients can use this API as well and so we lowered the required permission to Topic Describe. The only way the client can be sure that the new permissions are in use is to require version 3 of the protocol which was bumped for 2.3. If the broker does not support this version, we skip the validation and revert to the old behavior. Additionally, this patch fixes a problem with the newly added replicaId field when parsed from older versions which did not have it. If the field was not present, then we used the consumer's sentinel value, but this would limit the range of visible offsets by the high watermark. To get around this problem, this patch adds a separate "debug" sentinel similar to APIs like Fetch and ListOffsets. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
soondenana	46a02f3231	KAFKA-8341. Retry Consumer group operation for NOT_COORDINATOR error (#6723 ) An API call for consumer groups must send a FindCoordinatorRequest to find the consumer group coordinator, and then send a follow-up request to that node. But the coordinator might move after the FindCoordinatorRequest but before the follow-up request is sent. In that case we currently fail. This change fixes that by detecting this error and then retrying. This fixes listConsumerGroupOffsets, deleteConsumerGroups, and describeConsumerGroups. Reviewers: Colin P. McCabe <cmccabe@apache.org>, Boyang Chen <bchen11@outlook.com>	6 years ago
Guozhang Wang	4574b2438a	MINOR: Remove checking on original joined subscription within handleAssignmentMismatch (#6782 ) When the consumer coordinator realizes the subscription may have changed, today we check again against the joinedSubscription within handleAssignmentMismatch. This check, however, is a bit fishy and also overkill. It is better to simplify it to always request a re-join. The joinedSubscription object itself still needs to be maintained for potential augmentation, to avoid extra re-joins of the group. Since testOutdatedCoordinatorAssignment already covers the normal case, we also remove the other invalidAssignment test case. Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	cafdc1e7df	KAFKA-8399: bring back internal.leave.group.on.close config for KStream (#6779 ) As title states. We plan to merge this to both trunk and 2.3 if it could fix the stream system tests globally. Reference implementation: #6673 Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>	6 years ago
Jason Gustafson	4f11090597	HOTFIX: Fix recent protocol breakage from KIP-345 and KIP-392 (#6780 ) KIP-345 and KIP-392 introduced a couple breaking changes for old versions of bumped protocols. This patch fixes them. Reviewers: Colin Patrick McCabe <cmccabe@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Boyang Chen <bchen11@outlook.com>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
David Arthur	bacb45e044	MINOR: Set `replicaId` for OffsetsForLeaderEpoch from followers (#6775 ) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Rajini Sivaram	012880d424	KAFKA-8052; Ensure fetch session epoch is updated before new request (#6582 ) Reviewers: Jason Gustafson <jason@confluent.io>, Colin Patrick McCabe <cmccabe@confluent.io>, Andrew Olson <aolson1@cerner.com>, José Armando García Sancio <jsancio@users.noreply.github.com>	6 years ago
Magesh Nandakumar	7d70133b75	KAFKA-8265: Fix config name to match KIP-458. (#6755 ) Return a copy of the ConfigDef in Client Configs. Related to KIP-458. Author: Magesh Nandakumar <magesh.n.kumar@gmail.com> Reviewer: Randall Hauch <rhauch@gmail.com>	6 years ago
Rajini Sivaram	614ea55ad7	KAFKA-8381; Disable hostname validation when verifying inter-broker SSL (#6757 ) - Make endpoint validation configurable on SslEngineBuilder when creating an engine - Disable endpoint validation for engines created for inter-broker SSL validation since it is unsafe to use `localhost` - Use empty hostname in validation engine to ensure tests fail if validation is re-enabled by mistake - Add tests to verify inter-broker SSL validation Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	6 years ago
Stanislav Kozlovski	5a30a806ec	MINOR: Add log when the consumer does not send an offset commit due to not being part of an active group (#6404 ) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	e00c0d316d	MINOR: Fix typo in heartbeat request protocol definition (#6759 ) This changes the field "generationid" to "generationId" to be consistent with other uses. Reviewers: Shaobo Liu <lambda.tencent@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	9fa331b811	KAFKA-8225 & KIP-345 part-2: fencing static member instances with conflicting group.instance.id (#6650 ) When a static member joins or rejoins, we encode the current timestamp in the new member.id. The format looks like group.instance.id-timestamp. During consumer/broker interaction logic (Join, Sync, Heartbeat, Commit), we check whether the group.instance.id is known to the group. If yes, we match the member.id stored in the static membership map against the request member.id. A mismatch indicates that a conflicting consumer has used the same group.instance.id, and it will receive a fatal exception to shut down. Right now the only missing part is the system test. Will work on it offline while getting the major logic changes reviewed. Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
David Arthur	e2847e8603	KAFKA-8365; Consumer and protocol support for follower fetching (#6731 ) This patch includes API changes for follower fetching per [KIP-392](https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica) as well as the consumer implementation. After this patch, consumers will continue to fetch only from the leader, since the broker implementation to select an alternate read replica is not included here. A new `client.rack` consumer configuration property is added, which allows the consumer to indicate its rack. This is just an arbitrary string to indicate some relative location; it doesn't have to actually represent a physical rack. We are keeping the naming consistent with the broker property (`broker.rack`). FetchRequest now includes `rack_id`, which can optionally be specified by the consumer. FetchResponse includes an optional `preferred_read_replica` field for each partition in the response. OffsetForLeaderEpochRequest also adds a new `replica_id` field, which is similar to the same field in FetchRequest. When the consumer sees a `preferred_read_replica` in a fetch response, it will use the Node with that ID for the next fetch. Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Shaobo Liu	64c2d49cf5	MINOR: Add test for ConsumerNetworkClient.trySend (#6739 ) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Rajini Sivaram	8de7d37724	KAFKA-8379; Fix KafkaAdminClientTest.testUnreachableBootstrapServer (#6753 ) Initiate `unreachable server` scenario before starting admin client to avoid timing issues if node is disconnected from the test thread while admin client network thread is processing a metadata request. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Magesh Nandakumar	2e91a310d7	KAFKA-8265: Initial implementation for ConnectorClientConfigPolicy to enable overrides (KIP-458) (#6624 ) Implementation to enable policy for Connector Client config overrides. This is implemented per the KIP-458. Reviewers: Randall Hauch <rhauch@gmail.com>	6 years ago
Konstantine Karantasis	ce584a01ff	KAFKA-5505: Incremental cooperative rebalancing in Connect (KIP-415) (#6363 ) Added incremental cooperative rebalancing in Connect to avoid global rebalances of all connectors and tasks with each new/changed/removed connector. This new protocol is backward compatible and will work with heterogeneous clusters that exist during a rolling upgrade, but once the cluster consists of new workers only, just the affected connectors and tasks will be rebalanced: connectors and tasks on existing nodes still in the cluster and not added/changed/removed will continue running while the affected connectors and tasks are rebalanced. This commit attempted to minimize the changes to the existing V0 protocol logic, though that was not entirely possible. This commit adds extensive unit and integration tests for both the old V0 protocol and the new V1 protocol. Soak testing has been performed multiple times to verify behavior while connectors are added, changed, and removed and while workers are added and removed from the cluster. Author: Konstantine Karantasis <konstantine@confluent.io> Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <me@ewencp.org>, Robert Yokota <rayokota@gmail.com>, David Arthur <mumrah@gmail.com>, Ryanne Dolan <ryannedolan@gmail.com>	6 years ago
Jason Gustafson	26814e060e	KAFKA-8376; Least loaded node should consider connections which are being prepared (#6746 ) This fixes a regression caused by KAFKA-8275. The least loaded node selection should take into account nodes which are currently being connected to. This includes both the CONNECTING and CHECKING_API_VERSIONS states since `canSendRequest` would return false in either case. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
Mickael Maison	855f899bb5	KAFKA-8256; Replace Heartbeat request/response with automated protocol (#6691 ) Reviewers: Boyang Chen <bchen11@outlook.com>, Jason Gustafson <jason@confluent.io>	6 years ago
sandmannn	b96aa003b6	MINOR: Added missing method parameter to `performAssignment` javadoc (#6744 ) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	2208f9966d	KAFKA-8354; Replace Sync group request/response with automated protocol (#6729 ) Update SyncGroup API to use the generated protocol classes. Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
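
Note on the KAFKA-8483/KAFKA-8484 entry (af2801031c) above: the message attributes the bug to iterating a PriorityQueue, whose iteration order is not guaranteed to be sorted, and describes the fix as switching to a sorted set. The following is a minimal standalone Java sketch of that difference. It is not the Kafka producer code; the class name, the helper `iterate`, and the use of plain integers as stand-ins for batch base sequences are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;
import java.util.TreeSet;

public class IterationOrderSketch {

    // Collects elements in the order the collection's iterator yields them.
    // Hypothetical helper: each integer stands in for an in-flight batch's base sequence.
    static List<Integer> iterate(Iterable<Integer> batches) {
        List<Integer> seen = new ArrayList<>();
        for (int baseSequence : batches) {
            seen.add(baseSequence);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> baseSequences = Arrays.asList(5, 3, 4, 1, 2);

        // A PriorityQueue only guarantees that poll()/peek() return the smallest
        // element; its iterator walks the backing heap in no particular order.
        PriorityQueue<Integer> heap = new PriorityQueue<>(baseSequences);

        // A TreeSet iterates in ascending order, so walking it visits
        // the stand-in batches in sequence order.
        TreeSet<Integer> sorted = new TreeSet<>(baseSequences);

        System.out.println("PriorityQueue iteration: " + iterate(heap));
        System.out.println("TreeSet iteration:       " + iterate(sorted));
    }
}
```

Any logic that assigns values while iterating, such as renumbering sequences, is only safe over the TreeSet here, because only its iterator is specified to follow the element ordering.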

1557 Commits (03d61ebfb93aab53b0b0ecdfc77175d18a58e861)