src-kafka

Commit Graph

Author	SHA1	Message	Date
Mickael Maison	14d854936e	KAFKA-8390: Use automatic RPC generation in CreateDelegationToken (#6828) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Ismael Juma	fc27cbe415	KAFKA-8545: Remove legacy ZkUtils (#6948) `ZkUtils` is not used by the broker, has been deprecated since 2.0.0 and it was never intended as a public API. We should remove it along with `AdminUtils` methods that rely on it. Reviewers: David Arthur <mumrah@gmail.com>	5 years ago
Guozhang Wang	3e9d1c1411	KAFKA-8106: Skipping ByteBuffer allocation of key / value / headers in logValidator (#6785) * KAFKA-8106: Reducing the allocation and copying of ByteBuffer when logValidator does validation. * KAFKA-8106: Reducing the allocation and copying of ByteBuffer when logValidator does validation. * github comments * use batch.skipKeyValueIterator * cleanups * no need to skip kv for uncompressed iterator * checkstyle fixes * fix findbugs * adding unit tests * reuse decompression buffer; and using streaming iterator * checkstyle * add unit tests * remove reusing buffer supplier * fix unit tests * add unit tests * use streaming iterator * minor refactoring * rename * github comments * github comments * reuse buffer at DefaultRecord caller * some further optimization * major refactoring * further refactoring * update comment * github comments * minor fix * add jmh benchmarks * update jmh * github comments * minor fix * github comments	5 years ago
Ismael Juma	1233c963f8	MINOR: Remove legacy kafka.admin.AdminClient (#6947) It has been deprecated since 0.11.0, it was never meant as a publicly supported API and people should use `org.apache.kafka.clients.admin.AdminClient` instead. Its presence causes confusion and people still use it accidentally at times. `BrokerApiVersionsCommand` uses one method that is not available in `org.apache.kafka.clients.admin.AdminClient`, so we inline it for now. Reviewers: David Arthur <mumrah@gmail.com>, Ismael Juma <ismael@juma.me.uk>	5 years ago
Vikas Singh	d40805deac	MINOR: Fix Partition::toString method (#6971) The AllReplicas field was only printing remote replica ids. This change prints all ids, including the local one. Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Jason Gustafson <jason@confluent.io>	5 years ago
Mickael Maison	635c21311f	KAFKA-8564; Fix NPE on deleted partition dir when no segments remain (#6968) Kafka should not NPE while loading a deleted partition dir with no log segments. This patch ensures that there will always be at least one segment after initialization. Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com> Co-authored-by: Mickael Maison <mickael.maison@gmail.com> Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>	5 years ago
Boyang Chen	2c9a15045e	MINOR: fix consumer group failure message typo (#6962) Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>	5 years ago
jolshan	33e39de4ad	KAFKA-8448: Cancel PeriodicProducerExpirationCheck when closing a Log instance (#6847) Cancel PeriodicProducerExpirationCheck when closing a Log instance, to avoid a memory leak. Add a method to KafkaScheduler to make this possible. Reviewers: Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>	5 years ago
Jason Gustafson	52cd59bdb4	MINOR: Simplify controller election utilities (#6944) This patch simplifies the controller election API. We were passing `LeaderIsrAndControllerEpoch` into the election utilities even though we just needed `LeaderAndIsr`. We also remove some unneeded collection copies in `doElectLeaderForPartitions`. Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Vikas Singh	57baa4079d	KAFKA-8457; Move `Log` reference from `Replica` into `Partition` (#6841) A `Partition` object contains one or many `Replica` objects. These replica objects in turn can have the "log" if the replica corresponds to the local node. All the code in Partition or ReplicaManager peeks into the replica object to fetch the log if it needs to operate on that. As a replica object can represent a local replica or a remote one, this led to a bunch of "if-else" code in log fetch and offset update code. NOTE: In addition to a "log" that is in use during normal operation, if an alter log directory command is issued, we also create a future log object. This object catches up with the local log and then we switch the log directory. So temporarily a Partition can have two local logs. Before this change both logs were inside replica objects. This change is an attempt to untangle this relationship. In particular it moves `Log` from `Replica` into `Partition`. So a partition contains a local log to which all writes go and possibly a "future log" if the partition is being moved between directories. Additionally, it maintains a list of remote replicas for offset and "caught up time" data that it uses for the replication protocol. Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Jason Gustafson <jason@confluent.io>	5 years ago
Colin P. Mccabe	e047864f30	MINOR: fix some warnings in the broker Author: Colin P. Mccabe <cmccabe@confluent.io> Reviewers: Gwen Shapira Closes #6942 from cmccabe/fix-scala-warnings	6 years ago
Boyang Chen	f396372fb8	MINOR: add group coordinator test coverage (#6926) Some edge cases are not currently being tested. Add more tests to cover them. Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
José Armando García Sancio	1cc0b5eb81	MINOR: Seal the HostedPartition enumeration (#6917) Makes HostedPartition a sealed trait and makes all of the match cases explicit. Reviewers: Vikas Singh <soondenana@users.noreply.github.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	e981b82601	KAFKA-8500; Static member rejoin should always update member.id (#6899) This PR fixes a bug in static group membership. Previously we limited the `member.id` replacement in JoinGroup to only cases when the group is in Stable. This is error-prone and could potentially allow duplicate consumers reading from the same topic. For example, imagine a case where two unknown members join in the `PrepareRebalance` stage at the same time. The PR fixes the following things: 1. Replace `member.id` at any time we see a known static member rejoin the group with an unknown member.id 2. Immediately fence any ongoing join/sync group callback to terminate the duplicate member early. 3. Clearly handle Dead/Empty cases as exceptional. 4. Return old leader id upon static member leader rejoin to avoid trivial member assignment being triggered. Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Gardner Vickers	cb2c4c3d7a	MINOR: Remove integration package prefix from ConsumerTopicCreationTest (#6916) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	6 years ago
Guozhang Wang	bebcbe3a04	KAFKA-8487: Only request re-join on REBALANCE_IN_PROGRESS in CommitOffsetResponse (#6894) Plus some minor cleanups on AbstractCoordinator. Reviewers: Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>	6 years ago
Jason Gustafson	783ab74793	KAFKA-8333; Load high watermark checkpoint lazily when initializing replicas (#6800) Currently we load the high watermark checkpoint separately for every replica that we load. This patch makes this loading logic lazy and caches the loaded map while a LeaderAndIsr request is being handled. Reviewers: Jun Rao <junrao@gmail.com>	6 years ago
Jason Gustafson	1e5227c230	MINOR: Fix transient failure in PreferredReplicaLeaderElectionCommandTest (#6908) We have seen this failing recently due to the following error: ``` java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:366) at scala.None$.get(Option.scala:364) at kafka.admin.PreferredReplicaLeaderElectionCommandTest.getLeader(PreferredReplicaLeaderElectionCommandTest.scala:101) at kafka.admin.PreferredReplicaLeaderElectionCommandTest.testNoopElection(PreferredReplicaLeaderElectionCommandTest.scala:240) ``` We need to wait for the leader to be available. Reviewers: David Arthur <mumrah@gmail.com>	6 years ago
Jason Gustafson	bb8de0b8c5	KAFKA-8003; Fix flaky testFencingOnTransactionExpiration We see this failure from time to time: ``` java.lang.AssertionError: expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:633) at kafka.api.TransactionsTest.testFencingOnTransactionExpiration(TransactionsTest.scala:512) ``` The cause is probably that we are using `consumeRecordsFor` which has no expectation on the number of records to fetch and a timeout of just 1s. This patch changes the code to use `consumeRecords` and the default 15s timeout. Note we have also fixed a bug in the test case itself, which was using the wrong topic for the second write, which meant it could never have failed in the anticipated way anyway. Author: Jason Gustafson <jason@confluent.io> Reviewers: Gwen Shapira Closes #6905 from hachikuji/fix-flaky-transaction-test	6 years ago
Jason Gustafson	07dd7db324	MINOR: Increase timeout for flaky ResetConsumerGroupOffsetTest (#6900) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	6 years ago
Manikumar Reddy	060701fb6f	KAFKA-8461: Wait for follower to join the ISR in testUncleanLeaderElectionDisabled Test Author: Manikumar Reddy <manikumar.reddy@gmail.com> Reviewers: Jason Gustafson <jason@confluent.io>, Boyang Chen <boyang@confluent.io> Closes #6887 from omkreddy/unclean-leader	6 years ago
José Armando García Sancio	28eccc3ca3	KAFKA-8384; Leader election command integration tests (#6880) This patch adds test cases for the leader election command added in KIP-460. Reviewers: Vikas Singh, David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	72c3b88ed5	MINOR: Fixed compiler warnings in LogManagerTest (#6897) Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Wennn	562424cee8	MINOR: Remove redundant semicolons in `KafkaApis` imports (#6889) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Jason Gustafson	0e95c9f3a8	KAFKA-8400; Do not update follower replica state if the log read failed (#6814) This patch checks for errors handling a fetch request before updating follower state. Previously we were unsafely passing the failed `LogReadResult` with most fields set to -1 into `Replica` to update follower state. Additionally, this patch attempts to improve the test coverage for ISR shrinking and expansion logic in `Partition`. Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
Boyang Chen	fb1f74958d	KAFKA-8386; Use COORDINATOR_NOT_AVAILABLE error when group is Dead (#6762) The Dead state in the coordinator is used for groups which are either pending deletion or migration to a new coordinator. Currently requests received while in this state result in an UNKNOWN_MEMBER_ID which causes consumers to reset the memberId. This is a problem for KIP-345 since it can cause an older member to fence a newer member. This patch changes the error code returned in this state to COORDINATOR_NOT_AVAILABLE, which causes the consumer to rediscover the coordinator, but not reset the memberId. Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Almog Gavra	8e161580b8	KAFKA-8305; Support default partitions & replication factor in AdminClient#createTopic (KIP-464) (#6728) This commit makes four changes: - Adds a constructor for NewTopic(String, Optional<Integer>, Optional<Short>) which allows users to specify Optional.empty() for numPartitions or replicationFactor in order to use the broker default. - Changes AdminManager to accept -1 as a valid option for replication factor and numPartitions (resolving to broker defaults). - Makes --partitions and --replication-factor optional arguments when creating topics using kafka-topics.sh. - Adds a dependency on the scalaJava8Compat library to make it simpler to convert Scala Option to Java Optional. Reviewers: Ismael Juma <ismael@juma.me.uk>, Ryanne Dolan <ryannedolan@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Colin P. Mccabe	e6563aab72	KAFKA-4893; Fix deletion and moving of topics with long names Author: Colin P. Mccabe <cmccabe@confluent.io> Reviewers: Gwen Shapira, David Arthur, James Cheng, Vahid Hashemian Closes #6869 from cmccabe/KAFKA-4893	6 years ago
Guozhang Wang	573152dfa8	HOTFIX: Allow multi-batches for old format and no compression (#6871) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
José Armando García Sancio	45fae33937	KAFKA-8383; Integration tests for unclean `electLeaders` (#6857) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
José Armando García Sancio	121308cc7a	KAFKA-8286; Generalized Leader Election Admin RPC (KIP-460) (#6686) Implements KIP-460: https://cwiki.apache.org/confluence/display/KAFKA/KIP-460%3A+Admin+Leader+Election+RPC. Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Jason Gustafson	348e5d738f	MINOR: Fix transient failure in PartitionTest.testAddAndRemoveMetrics (#6834) The test was failing because other test cases are leaving some metrics in the registry. This patch cleans the registry before and after test runs. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Stanislav Kozlovski	efeefa5d18	MINOR: Check invalid bootstrap-server, alter option and config flags before checking for the required --partitions flag (#6786) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	6 years ago
Guozhang Wang	592410fc58	KAFKA-8428: Always require a single batch with un- / compressed messages (#6816) I think it's better just to make single-batch as a universal requirement for all versions for compressed messages, and for V2 and beyond uncompressed messages as well. Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Jason Gustafson	4a3c92bfc7	MINOR: Remove redundant checkpoint thread started field in ReplicaManager (#6813) We have two fields `highWatermarkCheckPointThreadStarted` and `hwThreadInitialized` which appear to be serving the same purpose. This patch gets rid of `hwThreadInitialized`. Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
Jason Gustafson	6afb0ca735	KAFKA-8351; Cleaner should handle transactions spanning multiple segments (#6722) When cleaning transactional data, we need to keep track of which transactions still have data associated with them so that we do not remove the markers. We had logic to do this, but the state was not being carried over when beginning cleaning for a new set of segments. This could cause the cleaner to incorrectly believe a transaction marker was no longer needed. The fix here carries the transactional state between groups of segments to be cleaned. Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Viktor Somogyi <viktorsomogyi@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Jason Gustafson	a1808962e5	KAFKA-8422; Client should send OffsetForLeaderEpoch only if broker supports latest version (#6806) In the olden days, OffsetForLeaderEpoch was exclusively an inter-broker protocol and required Cluster level permission. With KIP-320, clients can use this API as well and so we lowered the required permission to Topic Describe. The only way the client can be sure that the new permissions are in use is to require version 3 of the protocol which was bumped for 2.3. If the broker does not support this version, we skip the validation and revert to the old behavior. Additionally, this patch fixes a problem with the newly added replicaId field when parsed from older versions which did not have it. If the field was not present, then we used the consumer's sentinel value, but this would limit the range of visible offsets by the high watermark. To get around this problem, this patch adds a separate "debug" sentinel similar to APIs like Fetch and ListOffsets. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Lee Dongjin	eefb25d4ca	MINOR: Fix a few compiler warnings (#6767) Reviewers: Boyang Chen <bchen11@outlook.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Jason Gustafson	3696b9882d	KAFKA-8371: Remove dependence on ReplicaManager from Partition (#6705) This patch attempts to simplify the interaction between Partition and the various components from `ReplicaManager`. This is primarily to make unit testing easier. I have also tried to eliminate the OfflinePartition sentinel which has always been unsafe. Reviewers: Boyang Chen <bchen11@outlook.com>, David Arthur <mumrah@gmail.com>	6 years ago
Jason Gustafson	5f06999cf3	MINOR: Remove ControllerEventManager metrics on close (#6788) Remove created metrics when shutting down `ControllerEventManager`. This fixes transient failures in `ControllerEventManagerTest.testEventQueueTime` and is generally good hygiene. Reviewers: José Armando García Sancio <jsancio@gmail.com>, Ismael Juma <ismael@juma.me.uk>	6 years ago
David Arthur	bacb45e044	MINOR: Set `replicaId` for OffsetsForLeaderEpoch from followers (#6775) Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Jason Gustafson	ce5ce2d569	MINOR: A few logging improvements in the broker (#6773) Reviewers: Boyang Chen <bchen11@outlook.com>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
Manikumar Reddy	d77bac1c93	KAFKA-3143: Controller should transition offline replicas on startup Author: Manikumar Reddy <manikumar.reddy@gmail.com> Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io> Closes #5041 from omkreddy/KAFKA-3143	6 years ago
Boyang Chen	e00c0d316d	MINOR: Fix typo in heartbeat request protocol definition (#6759) This changes the field "generationid" to "generationId" to be consistent with other uses. Reviewers: Shaobo Liu <lambda.tencent@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	9fa331b811	KAFKA-8225 & KIP-345 part-2: fencing static member instances with conflicting group.instance.id (#6650) For static members join/rejoin, we encode the current timestamp in the new member.id. The format looks like group.instance.id-timestamp. During consumer/broker interaction logic (Join, Sync, Heartbeat, Commit), we shall check whether the group.instance.id is known to the group. If yes, we shall match the member.id stored in the static membership map with the request member.id. A mismatch indicates that a conflicting consumer has used the same group.instance.id, and it will receive a fatal exception to shut down. Right now the only missing part is the system test. Will work on it offline while getting the major logic changes reviewed. Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
David Arthur	e2847e8603	KAFKA-8365; Consumer and protocol support for follower fetching (#6731) This patch includes API changes for follower fetching per [KIP-392](https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica) as well as the consumer implementation. After this patch, consumers will continue to fetch only from the leader, since the broker implementation to select an alternate read replica is not included here. A new `client.rack` consumer configuration property is added which allows the consumer to indicate its rack. This is just an arbitrary string to indicate some relative location; it doesn't have to actually represent a physical rack. We are keeping the naming consistent with the broker property (`broker.rack`). FetchRequest now includes `rack_id` which can optionally be specified by the consumer. FetchResponse includes an optional `preferred_read_replica` field for each partition in the response. OffsetForLeaderEpochRequest also adds a new `replica_id` field which is similar to the same field in FetchRequest. When the consumer sees a `preferred_read_replica` in a fetch response, it will use the Node with that ID for the next fetch. Reviewers: Jason Gustafson <jason@confluent.io>	6 years ago
Jason Gustafson	d6057a9fe4	MINOR: Remove spammy log message during topic deletion Deletion of a large number of topics can cause a ton of log spam. In a test case on 2.2, deletion of 50 topics with 100 partitions each caused about 158 Mb of data in the controller log. With the improvements to batch StopReplica and the patch here, we reduce that to about 1.5 Mb. Kudos to gwenshap for spotting these spammy messages. Author: Jason Gustafson <jason@confluent.io> Reviewers: Gwen Shapira Closes #6738 from hachikuji/remove-verbose-topic-deletion-log-message	6 years ago
Aishwarya Gune	414852c701	KAFKA-8346; Improve replica fetcher behavior for handling partition failure [KIP-461] (#6716) The replica fetcher thread is terminated in case a partition crashes which leads to under replication. This behavior can be improved by dropping the failed partition. The thread can continue monitoring the rest of the partitions. If all partitions of a thread have failed, the thread would be shut down. This is documented in KIP-461: https://cwiki.apache.org/confluence/display/KAFKA/KIP-461+-+Improve+Replica+Fetcher+behavior+at+handling+partition+failure. Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Mickael Maison	855f899bb5	KAFKA-8256; Replace Heartbeat request/response with automated protocol (#6691) Reviewers: Boyang Chen <bchen11@outlook.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Boyang Chen	6e6dcceb93	KAFKA-8220; Avoid kicking out static group members through rebalance timeout (#6666) To make static consumer group members more persistent, we want to avoid kicking out unjoined members through rebalance timeout. Essentially we allow static members to participate in a rebalance using their old subscription without sending a JoinGroup. The only catch is that an unjoined static member might be the current group leader, and we may need to elect a different leader. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Jason Gustafson <jason@confluent.io>	6 years ago


2621 Commits (14d854936e1d2fed2e69a7c6367becf360f88833)