src-kafka

Commit Graph

Author	SHA1	Message	Date
Max Riedel	90e646052a	KAFKA-14509; [1/2] Define ConsumerGroupDescribe API request and response schemas and classes. (#14124 ) This patch adds the schemas of the new ConsumerGroupDescribe API (KIP-848) and adds an handler to the KafkaApis. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>	1 year ago
David Jacot	7054625c45	KAFKA-14499: [6/N] Add MemberId and MemberEpoch to OffsetFetchRequest (#14321 ) This patch adds the MemberId and the MemberEpoch fields to the OffsetFetchRequest. Those fields will be populated when the new consumer group protocol is used to ensure that the member fetching the offset has the correct member id and epoch. If it does not, UNKNOWN_MEMBER_ID or STALE_MEMBER_EPOCH are returned to the client. Our initial idea was to implement the same for the old protocol. The field is called GenerationIdOrMemberEpoch in KIP-848 to materialize this. As a second though, I think that we should only do it for the new protocol. The effort to implement it in the old protocol is not worth it in my opinion. Reviewers: Ritika Reddy <rreddy@confluent.io>, Calvin Liu <caliu@confluent.io>, Justine Olshan <jolshan@confluent.io>	1 year ago
Andrew Schofield	b49013b73e	KAFKA-9800: Exponential backoff for Kafka clients - KIP-580 (#14111 ) Implementation of KIP-580 to add exponential back-off to situations in which retry.backoff.ms is used to delay backoff attempts. This KIP adds exponential backoff behavior with a maximum controlled by a new config retry.backoff.max.ms, together with a +/- 20% of jitter to spread the retry attempts of the client fleet. Reviewers: Mayank Shekhar Narula <mayanks.narula@gmail.com>, Milind Luthra <i.milind.luthra@gmail.com>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com>	1 year ago
Lianet Magrans	1bb8c11f5a	KAFKA-14965 - OffsetsRequestsManager implementation & API integration (#14308 ) Implementation of the OffsetRequestsManager, responsible for building requests and processing responses for requests related to partition offsets. In this PR, the manager includes support for ListOffset requests, generated when the user makes any of the following consumer API calls: beginningOffsets endOffsets offsetsForTimes All previous consumer API calls interact with the OffsetsRequestsManager by generating a ListOffsetsApplicationEvent. Includes tests to cover the new functionality and to ensure no API level changes are introduced. This covers KAFKA-14965 and KAFKA-15081. Reviewers: Philip Nee <pnee@confluent.io>, Kirk True <kirk@mustardgrain.com>, Jun Rao<junrao@gmail.com>	1 year ago
Philip Nee	945d21953e	KAFKA-14875: Implement wakeup (#14118 ) Summary Implemented wakeup() mechanism using a WakeupTrigger class to store the pending wakeup item, and when wakeup() is invoked, it checks whether there's an active task or a wakeup task. If there's an active task: the task will be completed exceptionally and the atomic reference will be freed up. If there an wakedup task, which means wakeup() was invoked before a blocking call was issued. Therefore, the current task will be completed exceptionally immediately. This PR also addressed minor issues such as: Throwing WakeupException at the right place: As wakeups are thrown by completing an active future exceptionally. The WakeupException is wrapped inside of the ExecutionException. mockConstruction is a thread-lock mock; therefore, we need to free up the reference before completing the test. Otherwise, other tests will continue using the thread-lock mock. Reviewers: Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com>	1 year ago
Calvin Liu	b41b2dfcf2	KAFKA-15353: make sure AlterPartitionRequest.build() is idempotent (#14236 ) As described in https://issues.apache.org/jira/browse/KAFKA-15353 When the AlterPartitionRequest version is < 3 and its builder.build is called multiple times, both newIsrWithEpochs and newIsr will all be empty. This can happen if the sender retires on errors. Reviewers: Luke Chen <showuon@gmail.com>	1 year ago
Maros Orsak	5785796f98	MINOR: Add a few test cases to clients (#14211 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>	1 year ago
Phuc-Hong-Tran	8d12c1175c	KAFKA-15152: Fix incorrect format specifiers when formatting string (#14026 ) Reviewers: Divij Vaidya <diviv@amazon.com> Co-authored-by: phuchong.tran <phuchong.tran@servicenow.com>	1 year ago
Mehari Beyene	25b128de81	KAFKA-14991: KIP-937-Improve message timestamp validation (#14135 ) This implementation introduces two new configurations `log.message.timestamp.before.max.ms` and `log.message.timestamp.after.max.ms` and deprecates `log.message.timestamp.difference.max.ms`. The default value for all these three configs is maintained to be Long.MAX_VALUE for backward compatibility but with the newly added configurations we can have a finer control when validating message timestamps that are in the past and the future compared to the broker's timestamp. To maintain backward compatibility if the default value of `log.message.timestamp.before.max.ms` is not changed, we are assuming users are still using the deprecated config `log.message.timestamp.difference.max.ms` and validation is done using its value. This ensures that existing customers who have customized the value of `log.message.timestamp.difference.max.ms` will continue to see no change in behavior. Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>	1 year ago
Okada Haruki	87a30b73b5	KAFKA-15391: Handle concurrent dir rename which makes log-dir to be offline unexpectedly (#14280 ) A race condition between async flush and segment rename (for deletion purpose) might cause the entire log directory to be marked offline when we delete a topic. This PR fixes the bug by ignoring NoSuchFileException when we flush a directory. Reviewers: Divij Vaidya <diviv@amazon.com>	1 year ago
olalamichelle	9972297e51	KAFKA-14780: Fix flaky test 'testSecondaryRefreshAfterElapsedDelay' (#14078 ) "The test RefreshingHttpsJwksTest#testSecondaryRefreshAfterElapsedDelay relies on the actual system clock, which makes it frequently fail. The fix adds a second constructor that allows for passing a ScheduledExecutorService to manually execute the scheduled tasks before refreshing. The fixed task is much more robust and stable. Co-authored-by: Fei Xie <feixie@MacBook-Pro.attlocal.net> Reviewers: Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com>	1 year ago
Lianet Magrans	e9f358eef6	KAFKA-14937; [2/N]: Refactoring for client code to reduce boilerplate (#14218 ) This PR main refactoring relates to : 1. serializers/deserializers used in clients - unified in a Deserializers class 2. logic for configuring ClusterResourceListeners moved to ClientUtils 3. misc refactoring of the new async consumer in preparation for upcoming Request Managers Reviewers: Jun Rao <junrao@gmail.com>	1 year ago
Proven Provenzano	c2759df067	KAFKA-15219: KRaft support for DelegationTokens (#14083 ) Reviewers: David Arthur <mumrah@gmail.com>, Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Viktor Somogyi <viktor.somogyi@cloudera.com>	1 year ago
José Armando García Sancio	3f4816dd3e	KAFKA-15345; KRaft leader notifies leadership when listener reaches epoch start (#14213 ) In a non-empty log the KRaft leader only notifies the listener of leadership when it has read to the leader's epoch start offset. This guarantees that the leader epoch has been committed and that the listener has read all committed offsets/records. Unfortunately, the KRaft leader doesn't do this when the log is empty. When the log is empty the listener is notified immediately when it has become leader. This makes the API inconsistent and harder to program against. This change fixes that by having the KRaft leader wait for the listener's nextOffset to be greater than the leader's epochStartOffset before calling handleLeaderChange. The RecordsBatchReader implementation is also changed to include control records. This makes it possible for the state machine learn about committed control records. This additional information can be used to compute the committed offset or for counting those bytes when determining when to snapshot the partition. Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io>	1 year ago
Philip Nee	b97e8203eb	MINOR: CommitRequestManager should only poll when the coordinator node is known (#14179 ) As title, we discovered a flaky bug during testing that the commit request manager would seldomly throw a NOT_COORDINATOR exception, which means the request was routed to a non-coordinator node. We discovered that if we don't check the coordinator node in the commitRequestManager, the request manager will pass on an empty node to the NetworkClientDelegate, which implies the request can be sent to any node in the cluster. This behavior is incorrect as the commit requests need to be routed to a coordinator node. Because the timing coordinator's discovery during integration testing isn't entirely deterministic; therefore, the test became extremely flaky. After fixing this: The coordinator node is mandatory before attempt to enqueue these commit request to the NetworkClient. Reviewers: Jun Rao <junrao@gmail.com>	1 year ago
Greg Harris	1a001c1e88	KAFKA-15336: Add ServiceLoader Javadocs for Connect plugins (#14194 ) Reviewers: Chris Egerton <chrise@aiven.io>	1 year ago
Kirk True	67b527460e	KAFKA-14937: Refactoring for client code to reduce boilerplate (#13990 ) Move common code from the client implementations to the ClientUtils class or (consumer) Utils class, where passible. There are a number of places in the client code where the same basic calls are made by more than one client implementation. Minor refactoring will reduce the amount of boilerplate code necessary for the client to construct its internal state. Reviewers: Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com>	1 year ago
vveicc	43751d8d05	KAFKA-15289: Support KRaft mode in RequestQuotaTest (#14201 ) Enable kraft mode for RequestQuotaTest, there are 2 works left to be done. Reviewers: dengziming <dengziming1993@gmail.com>	1 year ago
Philip Nee	f6b8b39747	MINOR: Fix committed API in the PrototypeAsyncConsumer timeout (#14123 ) Discovered the committed() API timeout during the integration test. After investigation, this is because the future was not completed in the ApplicationEventProcessor. Also added toString methods to the event class for debug purposes. Reviewers: Jun Rao <junrao@gmail.com>	1 year ago
vveicc	594156e01b	KAFKA-15287: Change NodeApiVersions.create() to support both zk and kraft (#14185 ) Reviewers: dengziming <dengziming1993@gmail.com>	1 year ago
Federico Valeri	8de3e0436a	KAFKA-15239: Fix system tests using producer performance service (#14092 ) Reviewers: Greg Harris <greg.harris@aiven.io>	1 year ago
vveicc	393b563bb5	KAFKA-15288: Change BrokerApiVersionsCommandTest to support kraft mode (#14175 ) Use ApiKeys.clientApis() to replace ApiKeys.zkBrokerApis() to support kraft mode. Reviewers: dengziming <dengziming1993@gmail.com>	1 year ago
Qichao Chu	c72065a632	MINOR: Add test for describe topic with ID (#14110 ) * MINOR: Add test for describe topic with ID Add a simple test to verify topic description with topic IDs. Reviewers: Divij Vaidya <diviv@amazon.com>, dengziming <dengziming1993@gmail.com>	1 year ago
flashmouse	e0b7499103	KAFKA-15106: Fix AbstractStickyAssignor isBalanced predict (#13920 ) in 3.5.0 AbstractStickyAssignor may run useless loop in performReassignments because isBalanced have a trivial mistake, and result in rebalance timeout in some situation. Co-authored-by: lixy <lixy@tuya.com> Reviewers: Ritika Reddy <rreddy@confluent.io>, Philip Nee <pnee@confluent.io>, Kirk True <kirk@mustardgrain.com>, Guozhang Wang <wangguoz@gmail.com>	1 year ago
Philip Nee	811ae01723	MINOR: Test assign() and assignment() in the integration test (#14086 ) A missing piece from KAFKA-14950. This is to test assign() and assignment() in the integration test. Also fixed an accidental mistake in the committed API. Reviewers: Jun Rao <junrao@gmail.com>	1 year ago
James Shaw	afe631cd73	KAFKA-14967: fix NPE in CreateTopicsResult in MockAdminClient (#13671 ) Co-authored-by: James Shaw <james.shaw@masabi.com> Reviewers: Mickael Maison <mickael.maison@gmail.com>	1 year ago
Justine Olshan	6f39ef02ca	MINOR: Adjust Invalid Record Exception for Invalid Txn State as mentioned in KIP-890 (#14088 ) Invalid record is a newer error. INVALID_TXN_STATE has been around as long as transactions and is not retriable. This is the desired behavior.	1 year ago
David Jacot	29825ee24f	KAFKA-14499: [3/N] Implement OffsetCommit API (#14067 ) This patch introduces the `OffsetMetadataManager` and implements the `OffsetCommit` API for both the old rebalance protocol and the new rebalance protocol. It introduces version 9 of the API but keeps it as unstable for now. The patch adds unit tests to test the API. Integration tests will be done separately. Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>	1 year ago
vamossagar12	ff390ab60a	[MINOR] Fix Javadoc comment in KafkaFuture#toCompletionStage (#14100 ) Fix Javadoc comment in KafkaFuture#toCompletionStage Reviewers: Luke Chen <showuon@gmail.com>	1 year ago
tison	8b027b6fef	MINOR: Fix typo in ProduceRequest.json (#14070 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>	1 year ago
Philip Nee	1656591d0b	KAFKA-14950: implement assign() and assignment() (#13797 ) We will explicitly send an assignment change event to the background thread to invoke auto-commit if the group.id is configured. After updating the subscription state, a NewTopicsMetadataUpdateRequestEvent will also be sent to the background thread to update the metadata. Co-authored-by: Kirk True <kirk@kirktrue.pro> Reviewers: Jun Rao <junrao@gmail.com>	1 year ago
David Jacot	69659b70fc	KAFKA-14499: [1/N] Introduce OffsetCommit API version 9 and add new StaleMemberEpochException error (#14046 ) This patch does a few things: 1) It introduces version 9 of the OffsetCommit API. This new version has no schema changes but it can return a StaleMemberEpochException if the new consumer group protocol is used. Note the use of `"latestVersionUnstable": true` in the request schema. This means that this new version is not available yet unless activated. 2) It renames the `generationId` field in the request to `GenerationIdOrMemberEpoch`. This is backward compatible change. 3) It introduces the new StaleMemberEpochException error. 4) It does a minor refactoring in OffsetCommitRequest class. Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Arthur <mumrah@gmail.com>, Justine Olshan <jolshan@confluent.io>	1 year ago
Greg Harris	125dbb9286	KAFKA-14760: Move ThroughputThrottler from tools to clients, remove tools dependency from connect-runtime (#13313 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	1 year ago
Jeff Kim	a500c3ecf9	KAFKA-14500; [5/N] Implement JoinGroup protocol in new GroupCoordinator (#13870 ) This patch implements the existing JoinGroup protocol within the new group coordinator. Some notable differences: * Methods return a CoordinatorResult to the runtime framework, which includes records to append to the log as well as a future to complete after the append succeeds/fails. * The coordinator runtime ensures that only a single thread will be processing a group at any given time, therefore there is no more locking on groups. * Instead of using on purgatories, we rely on the Timer interface to schedule/cancel delayed operations. Reviewers: David Jacot <djacot@confluent.io>	1 year ago
Max Riedel	15418db69d	KAFKA-15123: Add tests for ChunkedBytesStream (#13941 ) Reviewers: Divij Vaidya <diviv@amazon.com>	1 year ago
Justine Olshan	ea0bb00126	KAFKA-14884: Include check transaction is still ongoing right before append (take 2) (#13787 ) Introduced extra mapping to track verification state. When verifying, there is a race condition that the add partitions verification response returns that the partition is in the ongoing transaction, but an abort marker is written before we get to append. Therefore, we track any given transaction we are verifying with an object unique to that transaction. We check this unique state upon the first append to the log. After that, we can rely on currentTransactionFirstOffset. We remove the verification state on appending to the log with a transactional data record or marker. We will also clean up lingering verification state entries via the producer state entry expiration mechanism. We do not update the the timestamp on retrying a verification for a transaction, so each entry must be verified before producer.id.expiration.ms. There were a few other fixes: - Moved the transaction manager handling for failed batch into the future completed exceptionally block to avoid processing it twice (this caused issues in unit tests) - handle interrupted exceptions encountered when callback thread encountered them - change handling to throw error if we try to set verification state and leaderLogIfLocal is None. Reviewers: David Jacot <djacot@confluent.io>, Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io>	1 year ago
Okada Haruki	ab71c56973	KAFKA-12261: Mention about potential delivery loss on increasing partition when auto.offset.reset = latest (#10167 ) Splitting partitions while setting auto.offset.reset to latest may cause message delivery loss, but users might not be aware about that since currently it isn't documented anywhere. Reviewers: Luke Chen <showuon@gmail.com>	1 year ago
Alyssa Huang	5b5f6fcafb	[KAFKA-15137] Do not log entire request payload in KRaftControllerChannelManager (#13988 ) Reviewers: David Arthur <mumrah@gmail.com>	1 year ago
Cheryl Simmons	e98508747a	Doc fixes: Fix format and other small errors in config documentation (#13661 ) Various formatting fixes in the config docs. Reviewers: Bill Bejeck <bbejeck@apache.org>	1 year ago
Lianet Magrans	4a61b48d3d	KAFKA-14966; [2/N] Extract OffsetFetcher reusable logic (#13898 ) This is a follow up on the initial OffsetFetcher refactoring to extract reusable logic, needed for the new consumer implementation (initial refactoring merged with PR-13815). Similar to the initial refactoring, this PR brings no changes to the existing logic, just extracting functions or pieces of logic. There were no individual tests for the extracted functions, so no tests were migrated. Reviewers: Jun Rao <junrao@gmail.com>	1 year ago
Gantigmaa Selenge	b2d647904c	KAFKA-8982: Add retry of fetching metadata to Admin.deleteRecords (#13760 ) Use AdminApiDriver class to refresh the metadata and retry the request that failed with retriable errors. Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Mickael Maison <mmaison@redhat.com>, Dimitar Dimitrov <30328539+dimitarndimitrov@users.noreply.github.com>	1 year ago
Ismael Juma	1f4cbc5d53	MINOR: Add JDK 20 CI build and remove some branch builds (#12948 ) It's good for us to add support for Java 20 in preparation for Java 21 - the next LTS. Given that Scala 2.12 support has been deprecated, a Scala 2.12 variant is not included. Also remove some branch builds that add load to the CI, but have low value: JDK 8 & Scala 2.13 (JDK 8 support has been deprecated), JDK 11 & Scala 2.12 (Scala 2.12 support has been deprecated) and JDK 17 & Scala 2.12 (Scala 2.12 support has been deprecated). A newer version of Mockito (4.9.0 -> 4.11.0) is required for Java 20 support, but we only use it with Scala 2.13+ since it causes compilation errors with Scala 2.12. Similarly, we upgrade easymock when the Java version is 16 or newer as it's incompatible with powermock (which doesn't support Java 16 or newer). Filed KAFKA-15117 for a test that fails with Java 20 (SslTransportLayerTest.testValidEndpointIdentificationCN). Finally, fixed some lossy conversions that were added after #13582 was submitted. Reviewers: Ismael Juma <ismael@juma.me.uk>	1 year ago
Kirk True	a81f35c1c8	KAFKA-14831: Illegal state errors should be fatal in transactional producer (#13591 ) Poison the transaction manager if we detect an illegal transition in the Sender thread. A ThreadLocal in is stored in TransactionManager so that the Sender can inform TransactionManager which thread it's using. Reviewers: Daniel Urban <durban@cloudera.com>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>	1 year ago
Bo Gao	005416879e	KAFKA-15053: Use case insensitive validator for security.protocol config (#13831 ) Fixed a regression described in KAFKA-15053 that security.protocol only allows uppercase values like PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL. With this fix, both lower case and upper case values will be supported (e.g. PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL, plaintext, ssl, sasl_plaintext, sasl_ssl) Reviewers: Chris Egerton <chrise@aiven.io>, Divij Vaidya <diviv@amazon.com>	1 year ago
José Armando García Sancio	ee88a3d1b9	MINOR; Failed atomic file move should be logged at WARN (#13917 ) When Kafka fails to perform an atomic file move the error is getting swallowed. Kafka should log these cases at least at WARN level. Reviewers: Ron Dagostino <rndgstn@gmail.com>, Kirk True <kirk@kirktrue.pro>	1 year ago
Justine Olshan	2f71708955	KAFKA-15028: AddPartitionsToTxnManager metrics (#13798 ) Adding the following metrics as per kip-890: VerificationTimeMs – number of milliseconds from adding partition info to the manager to the time the response is sent. This will include the round trip to the transaction coordinator if it is called. This will also account for verifications that fail before the coordinator is called. VerificationFailureRate – rate of verifications that returned in failure either from the AddPartitionsToTxn response or through errors in the manager. AddPartitionsToTxnVerification metrics – separating the verification request metrics from the typical add partitions ones similar to how fetch replication and fetch consumer metrics are separated. Reviewers: Divij Vaidya <diviv@amazon.com>	1 year ago
minjian.cai	e71f68d6c9	MINOR: fix typos for client (#13884 ) Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kirk True <ktrue@confluent.io>	1 year ago
Manyanda Chitimbo	c5889fcedd	MINOR: Split ConsumerCoordinator#testCommitOffsetMetadata onto two test cases testing commitSync and commitAsync (#13665 ) Split ConsumerCoordinator#testCommitOffsetMetadata onto two test cases testing commitSync and commitAsync Reviewers: Luke Chen <showuon@gmail.com>	1 year ago
Jeff Kim	1dbcb7da9e	KAFKA-14694: RPCProducerIdManager should not wait on new block (#13267 ) RPCProducerIdManager initiates an async request to the controller to grab a block of producer IDs and then blocks waiting for a response from the controller. This is done in the request handler threads while holding a global lock. This means that if many producers are requesting producer IDs and the controller is slow to respond, many threads can get stuck waiting for the lock. This patch aims to: * resolve the deadlock scenario mentioned above by not waiting for a new block and returning an error immediately * remove synchronization usages in RpcProducerIdManager.generateProducerId() * handle errors returned from generateProducerId() so that KafkaApis does not log unexpected errors * confirm producers backoff before retrying * introduce backoff if manager fails to process AllocateProducerIdsResponse Reviewers: Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io>	1 year ago
Ismael Juma	9c8aaa2c35	MINOR: Fix lossy conversions flagged by Java 20 (#13582 ) An example of the warning: > warning: [lossy-conversions] implicit cast from long to int in compound assignment is possibly lossy There should be no change in behavior as part of these changes - runtime logic ensured we didn't run into issues due to the lossy conversions. Reviewers: Divij Vaidya <diviv@amazon.com>	1 year ago

1 2 3 4 5 ...

2869 Commits (trunk)