src-kafka

Commit Graph

Author	SHA1	Message	Date
Manikumar Reddy O	028e80204d	KAFKA-6835: Enable topic unclean leader election to be enabled without controller change (#4957 ) Reviewers: Jun Rao <junrao@gmail.com>	6 years ago
Jason Gustafson	8325046be2	KAFKA-7298; Raise UnknownProducerIdException if next sequence number is unknown (#5518 ) If the only producer state left in the log is a transaction marker, then we do not know the next expected sequence number. This can happen if there is a call to DeleteRecords which arrives prior to the writing of the marker. Currently we raise an OutOfOrderSequence error when this happens, but this is treated as a fatal error by the producer. Raising UnknownProducerId instead allows the producer to check for truncation using the last acknowledged sequence number and reset if possible. Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
Manikumar Reddy O	914ffa9dbe	KAFKA-7210: Add system test to verify log compaction (#5226 ) * Updated TestLogCleaning tool to use Java consumer and rename as LogCompactionTester. * Enabled the log cleaner in every system test. * Removed configs from "kafka.properties" with default values and `socket.receive.buffer.bytes` as the override did not seem necessary. * Updated `kafka.py` logic to handle duplicates between `kafka.properties` and `server_prop_overrides`. * Updated Gradle build so that classes from `kafka-clients` test jar can be used in system tests. Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Ismael Juma <ismael@juma.me.uk>	6 years ago
huxi	e7b9e04730	KAFKA-7299: Batch LeaderAndIsr requests for AutoLeaderRebalance (#5515 ) Reviewers: Jun Rao <junrao@gmail.com>	6 years ago
Rajini Sivaram	634f9af8c0	KAFKA-7119: Handle transient Kerberos errors on server side (#5509 ) Don't report retriable Kerberos errors on the server-side as authentication failures to clients. Reviewers: Jun Rao <junrao@gmail.com>	6 years ago
radai-rosenblatt	bb4cc49628	KAFKA-7019; Make reading metadata lock-free by maintaining an atomically-updated read snapshot Author: radai-rosenblatt <radai.rosenblatt@gmail.com> Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <github@juma.me.uk>, Dong Lin <lindong28@gmail.com> Closes #5221 from radai-rosenblatt/metadata-adventures	6 years ago
Jason Gustafson	7a9631e634	MINOR: Use explicit construction of clients in IntegrationTestHarness (#5443 ) Pre-initialization of clients in IntegrationTestHarness is a cause of significant confusion and has resulted in a bunch of inconsistent client creation patterns. This patch requires test cases to create needed clients explicitly and makes the creation logic more consistent. Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
Rajini Sivaram	0d73351852	KAFKA-7119: Handle transient Kerberos errors as non-fatal exceptions (#5487 ) Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>	6 years ago
Rajini Sivaram	cc8fc7c449	MINOR: Clean up to avoid errors in dynamic broker config tests (#5486 ) Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Viktor Somogyi <viktorsomogyi@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Stanislav Kozlovski	76a00d78be	KAFKA-7266: Fix MetricsTest.testMetrics flakiness using compression (#5485 ) Increase record size and use compression for downconversion metrics test to ensure that conversion time is above 1ms to avoid transient test failures. Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
Ismael Juma	a2bc237cef	MINOR: Remove AbstractFetcherThread.PartitionData (#5233 ) Since ConsumerFetcherThread has been removed, we have an opportunity to simplify the *FetcherThread classes. This is an unambitious first step which removes the now unneeded `PartitionData` indirection.	6 years ago
Vahid Hashemian	e6fd99dc62	KAFKA-5638; Improve the Required ACL of ListGroups API (KIP-231) (#5352 ) Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Bob Barrett	69283b4038	KAFKA-7164; Follower should truncate after every missed leader epoch change (#5436 ) Currently, we skip the steps to make a replica a follower if the leader does not change, including truncating the follower log if necessary. This can cause problems if the follower has missed one or more leader updates. Change the logic to only skip the steps if the new epoch is the same or one greater than the old epoch. Tested with unit tests that verify the behavior of `Partition` and that show log truncation when the follower's log is ahead of the leader's, the follower has missed an epoch update, and the follower receives a `LeaderAndIsrRequest` making it a follower. Reviewers: Stanislav Kozlovski <familyguyuser192@windowslive.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Viktor Somogyi	8a78d76466	KAFKA-7140; Remove deprecated poll usages (#5319 ) Reviewers: Matthias J. Sax <mjsax@apache.org>, Jason Gustafson <jason@confluent.io>	6 years ago
Dong Lin	be43e2330e	KAFKA-7147; ReassignPartitionsCommand should be able to connect to broker over SSL Author: Dong Lin <lindong28@gmail.com> Reviewers: Andras Beni <andrasbeni@cloudera.com>, Manikumar Reddy O <manikumar.reddy@gmail.com>, Sriharsha Chintalapani <sriharsha@apache.org> Closes #5355 from lindong28/KAFKA-7147	6 years ago
Manikumar Reddy O	92004fa21a	KAFKA-6751; Support dynamic configuration of max.connections.per.ip/max.connections.per.ip.overrides configs (KIP-308) (#5334 ) KIP-308 implementation. See https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=85474993. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Rajini Sivaram	ce19f34f1e	KAFKA-7255: Fix timing issue with create/update in SimpleAclAuthorizer (#5478 ) ACL updates currently get `(currentAcls, currentVersion)` for the resource from ZK and do a conditional update using `(currentAcls+newAcl, currentVersion)`. This supports concurrent atomic updates if the resource path already exists in ZK. If the path doesn't exist, we currently do a conditional createOrUpdate using `(newAcl, -1)`. But `-1` has a special meaning in ZooKeeper for update operations - it means match any version. So two brokers adding acls using `(newAcl1, -1)` and `(newAcl2, -1)` will result in one broker creating the path and setting newAcl1, while the other broker can potentially update the path with `(newAcl2, -1)`, losing newAcl1. The timing window is very small, but we have seen intermittent failures in `SimpleAclAuthorizerTest.testHighConcurrencyModificationOfResourceAcls` as a result of this window. This commit fixes the version used for conditional updates in ZooKeeper. It also replaces the confusing `ZkVersion.NoVersion=-1` used for `set(any-version)` and `get(return not-found)` with `ZkVersion.MatchAnyVersion` for `set(any-version)` and `ZkVersion.UnknownVersion` for `get(return not-found)` to avoid the return value from `get` matching arbitrary values in `set`.	6 years ago
Marko Stanković	b966ce127c	Fix a typo in delegation.token.expiry.time.ms docs (#5449 ) Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>	6 years ago
uncleGen	b0d840d34b	KAFKA-5928; Avoid redundant requests to zookeeper when reassign topic partition Author: uncleGen <hustyugm@gmail.com> Reviewers: Ismael Juma <ismael@juma.me.uk>, Dong Lin <lindong28@gmail.com> Closes #3894 from uncleGen/KAFKA-5928	6 years ago
ying-zheng	b01f8fb668	KAFKA-7142: fix joinGroup performance issues (#5354 ) Summary: 1. Revert GroupMetadata.members to private 2. Add back a wrongly removed comment 3. In GroupMetadata.remove(), update supportedProtocols and awaitingJoinCallbackMembers, only when the remove succeeded Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Sriharsha Chintalapani <sriharsha@apache.org>	6 years ago
Jason Gustafson	fc5f6b0e46	MINOR: Add Timer to simplify timeout bookkeeping and use it in the consumer (#5087 ) We currently do a lot of bookkeeping for timeouts which is both error-prone and distracting. This patch adds a new `Timer` class to simplify this logic and control unnecessary calls to system time. In particular, this helps with nested timeout operations. The consumer has been updated to use the new class. Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Jason Gustafson	c3e7c0bcb2	MINOR: Producers should set delivery timeout instead of retries (#5425 ) Use delivery timeout instead of retries when possible and remove various TODOs associated with completion of KIP-91. Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Lucas Wang	96bc0b882d	KAFKA-7180; Fix the flaky test testHWCheckpointWithFailuresSingleLogSegment by waiting until server1 has joined the ISR before shutting down server2. The test method was rerun many times after the code change and showed no further flakiness. Author: Lucas Wang <luwang@linkedin.com> Reviewers: Mayuresh Gharat <gharatmayuresh15@gmail.com>, Dong Lin <lindong28@gmail.com> Closes #5387 from gitlw/fixing_flacky_logrecevorytest	6 years ago
Dhruvil Shah	08a4cda34e	[MINOR] Improve consumer logging on LeaveGroup (#5420 ) * Improve consumer logging on LeaveGroup * Add GroupCoordinator logging, and address review comments Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
ying-zheng	a61594dee1	KAFKA-6432: Make index lookup more cache friendly (#5346 ) For each topic-partition, the Kafka broker maintains two indices: one for message offset, one for message timestamp. By default, a new index entry is appended to each index for every 4 KB of messages. The lookup of the indices is a simple binary search. The indices are mmapped files, cached by the Linux page cache. Both consumer fetch and follower fetch have to do an offset lookup before accessing the actual message data. The simple binary search algorithm used for looking up the index is not cache friendly, and may cause page faults even on high QPS topic-partitions. In a normal Kafka broker, all the follower fetch requests, and most consumer fetch requests, should only look up the last few entries of the index. We can make the index lookup more cache friendly by searching in the last one or two pages of the index first. Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Guozhang Wang <wangguoz@gmail.com>, Ted Yu <yuzhihong@gmail.com>, Ismael Juma <github@juma.me.uk>, Sriharsha Chintalapani <sriharsha@apache.org>	6 years ago
Yu Yang	7fc7136ffd	KAFKA-5886; Introduce delivery.timeout.ms producer config (KIP-91) (#5270 ) Co-authored-by: Sumant Tambe <sutambe@yahoo.com> Co-authored-by: Yu Yang <yuyang@pinterest.com> Reviewers: Ted Yu <yuzhihong@gmail.com>, Apurva Mehta <apurva@confluent.io>, Jason Gustafson <jason@confluent.io>	6 years ago
Manikumar Reddy O	52c5b5f111	MINOR: Remove unused TopicAndPartition usage in tests (#5419 ) Also replace `TopicAndPartition` with `TopicPartition` in `MetadataCache`. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Manikumar Reddy O	5db2f9903a	MINOR: Close ZooKeeperClient if waitUntilConnected fails during construction (#5411 ) This has always been an issue, but the recent upgrade to ZooKeeper 3.4.13 means it is also an issue when an unresolvable ZK address is used, causing some tests to leak threads. The change in behaviour in ZK 3.4.13 is that no exception is thrown from the ZooKeeper constructor in case of an unresolvable address. Instead, ZooKeeper tries to re-resolve the address hoping it becomes resolvable again. We eventually throw a `ZooKeeperClientTimeoutException`, which is similar to the case where the address is resolvable but ZooKeeper is not reachable. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Zhanxiang (Patrick) Huang	9a7f29c1ed	KAFKA-7152; Avoid moving a replica out of isr if its LEO equals leader's LEO When there are many inactive partitions in the cluster, we observed constant churn of URP in the cluster even if follower can catch up with leader's byte-in-rate because leader broker frequently moves replicas of inactive partitions out of ISR. This PR mitigates this issue by not moving replica out of ISR if follower's LEO == leader's LEO. Author: Zhanxiang (Patrick) Huang <hzxa21@hotmail.com> Reviewers: Dong Lin <lindong28@gmail.com> Closes #5412 from hzxa21/KAFKA-7152	6 years ago
Dhruvil Shah	d11f6f26b7	KAFKA-6897; Prevent KafkaProducer.send from blocking when producer is closed (#5027 ) After successful completion of KafkaProducer#close, it is possible that an application calls KafkaProducer#send. If the send is invoked for a topic for which we do not have any metadata, the producer will block until `max.block.ms` elapses - we do not expect to receive any metadata update in this case because Sender (and NetworkClient) has already exited. It is only when RecordAccumulator#append is invoked that we notice that the producer has already been closed and throw an exception. If `max.block.ms` is set to Long.MaxValue (or a sufficiently high value in general), the producer could block awaiting metadata indefinitely. This patch makes sure `Metadata#awaitUpdate` periodically checks if the network client has been closed, and if so bails out as soon as possible.	6 years ago
Sandor Murakozi	591954e2e5	MINOR: Add registerController method to KafkaZkClient (#4598 ) And change KafkaController to use the newly introduced method. Also remove redundant `InZk` postfixes from `registerBrokerInZk` and `updateBrokerInfoInZk`. As `checkedEphemeralCreate` is not used outside of `KafkaZkClient` any longer, reduce its visibility. ControllerIntegrationTest already covers this functionality well, it validates the refactor. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Colin Patrick McCabe	b9b70c95a2	MINOR: Change "no such session ID" log to debug (#5316 ) Improve the log messages while at it and fix some code style issues. Reviewers: Ismael Juma <ismael@juma.me.uk>	6 years ago
Dhruvil Shah	9449f055c7	KAFKA-7185: Allow empty resource name when matching ACLs (#5400 ) Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
huxi	bf237fa7c5	KAFKA-7141; Consumer group describe should include groups with no committed offsets (#5356 ) Currently, if a consumer group never commits offsets, ConsumerGroupCommand will not include it in the describe output even if the member assignment is valid. Instead, the tool should be able to describe the group information showing empty current_offset and LAG. Reviewers: Sriharsha Chintalapani <sriharsha@apache.org>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Zhanxiang (Patrick) Huang	80b55309d1	KAFKA-7098; Improve accuracy of throttling by avoiding under-estimating actual rate in Throttler Author: Zhanxiang (Patrick) Huang <hzxa21@hotmail.com> Reviewers: Dong Lin <lindong28@gmail.com> Closes #5350 from hzxa21/KAFKA-7098	6 years ago
Manikumar Reddy	7c9a7359dc	MINOR: Consolidate Topic create calls in Test classes - Replace adminZkClient.createOrUpdateTopicPartitionAssignmentPathInZK calls with TestUtils.createTopic wherever applicable - Replace adminZkClient.createTopic calls with TestUtils.createTopic wherever applicable - Move non-deprecated tests to other test classes and deprecate AdminTest.scala - Remove duplicate tests between AdminTest and AdminZkClientTest Author: Manikumar Reddy <manikumar.reddy@gmail.com> Reviewers: Ismael Juma <ismael@juma.me.uk>, Dong Lin <lindong28@gmail.com> Closes #5303 from omkreddy/topiccreate	6 years ago
Lee Dongjin	0c0ced770c	MINOR: Fix broken Javadoc on [AbstractIndex\|OffsetIndex] (#5370 ) In the javadoc of `AbstractIndex` and `OffsetIndex`, thrown `Exception`s are not imported.	6 years ago
Ismael Juma	f6219c6ad1	KAFKA-4041: Update ZooKeeper to 3.4.13 (#5376 ) This includes a fix for ZOOKEEPER-2184 (Zookeeper Client should re-resolve hosts when connection attempts fail), which fixes KAFKA-4041. Updated a couple of tests as unresolvable addresses are now retried until the connection timeout. Cleaned up tests a little. Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
Attila Sasvari	8ec8ec5422	KAFKA-6884; Consumer group command should use new admin client (#5032 ) Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Mickael Maison	b68766017f	MINOR: Additional detail in description for zookeeper.connect (#5358 ) This setting allows specifying a chroot so we documented it. Co-authored-by: Mickael Maison <mickael.maison@gmail.com> Co-authored-by: Katherine Farmer <kfarme3@uk.ibm.com> Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Jason Gustafson <jason@confluent.io>	6 years ago
Aviem Zur	000a2d42cb	MINOR: Print exception stack traces in ConsumerGroupCommand. (#5286 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
Ismael Juma	f54ba7cf85	MINOR: Use FetchRequest v8 and ListOffsetRequest v3 in ReplicaFetcherThread (#5342 ) If inter.broker.protocol.version is 2.0-IV1 or newer. Also fixed ListOffsetRequest so that v2 is used, if applicable. Added a unit test which verifies that we use the latest version of the various requests by default. Included a few minor tweaks to make testing easier. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
Rajini Sivaram	0db8074d49	MINOR: Close timing window in SimpleAclAuthorizer startup (#5318 ) ZooKeeper listener for change notifications should be created before loading the ACL cache to avoid timing window if acls are modified when broker is starting up. Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@confluent.io>	6 years ago
Stanislav Kozlovski	b0d2ddb330	KAFKA-7028: Properly authorize custom principal objects (#5311 ) Use KafkaPrincipal objects for authorization in `SimpleAclAuthorizer` so that comparison with super.users and ACLs instantiated from Strings work. Previously, it would compare two different classes `KafkaPrincipal` and the custom class, which would always return false because of the implementation of `KafkaPrincipal#equals`.	6 years ago
Anna Povzner	10b84a3661	KAFKA-7104: More consistent leader's state in fetch response (#5305 ) Do not update LogReadResult after it is initially populated when returning fetches immediately (i.e. without hitting the purgatory). This was done in #3954 as an optimization so that the followers get the potentially updated high watermark. However, since many things can happen (like deleting old segments and advancing log start offset) between initial creation of LogReadResult and the update, we can hit issues like log start offset in fetch response being higher than the last offset in fetched records. Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>	6 years ago
Manikumar Reddy O	51935ee2e6	KAFKA-7091; AdminClient should handle FindCoordinatorResponse errors (#5278 ) - Update KafkaAdminClient implementation to handle FindCoordinatorResponse errors - Remove scala AdminClient usage from core and streams tests Reviewers: Matthias J. Sax <matthias@confluent.io>, Jason Gustafson <jason@confluent.io>	6 years ago
Dhruvil Shah	2db7eb7a8c	KAFKA-7076; Skip rebuilding producer state when using old message format (#5254 ) This patch removes the need to build up producer state when the log is using V0 / V1 message format where we did not have idempotent and transactional producers yet. Also fixes a small issue where we incorrectly reported the offset index corrupt if the last offset in the index is equal to the base offset of the segment.	6 years ago
Ismael Juma	7a74ec62d2	MINOR: Avoid FileInputStream/FileOutputStream (#5281 ) They rely on finalizers (before Java 11), which create unnecessary GC load. The alternatives are as easy to use and don't have this issue. Also use FileChannel directly instead of retrieving it from RandomAccessFile whenever possible since the indirection is unnecessary. Finally, add a few try/finally blocks. Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
Vahid Hashemian	9bc9a37e50	MINOR: KIP-211 Follow-up (#5272 ) Updates the description of `offsets.retention.minutes` config, and fixes an upgrade note.	6 years ago
Dong Lin	9ea81baf34	KAFKA-6949; alterReplicaLogDirs() should grab partition lock when accessing log of the future replica NoSuchElementException will be thrown if ReplicaAlterDirThread replaces the current replica with future replica right before the request handler thread executes `futureReplica.log.get.dir.getParent` in the ReplicaManager.alterReplicaLogDirs(). The solution is to grab the partition lock when request handler thread attempts to check the destination log directory of the future replica. Author: Dong Lin <lindong28@gmail.com> Reviewers: Jun Rao <junrao@gmail.com> Closes #5081 from lindong28/KAFKA-6949	6 years ago
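
The KAFKA-6432 entry above (a61594dee1) describes searching the tail of an index before falling back to the rest of it. The following is a minimal Python sketch of that idea only, not the code from the commit: the names `lookup` and `_largest_lower_bound`, the `warm_entries` parameter, and the use of a plain sorted list in place of an mmapped index file are all assumptions made for illustration.

```python
from typing import Sequence


def _largest_lower_bound(offsets: Sequence[int], target: int, begin: int, end: int) -> int:
    """Return the largest index i in [begin, end] with offsets[i] <= target.

    Precondition: offsets[begin] <= target.
    """
    lo, hi = begin, end
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if offsets[mid] <= target:
            lo = mid
        else:
            hi = mid - 1
    return lo


def lookup(offsets: Sequence[int], target: int, warm_entries: int = 4) -> int:
    """Find the slot of the largest indexed offset <= target, or -1 if none.

    The last `warm_entries` slots (a hypothetical stand-in for "the last one
    or two pages") are searched first, so lookups near the end of the index
    only touch the tail.
    """
    if not offsets:
        return -1
    last = len(offsets) - 1
    first_warm = max(0, last - warm_entries + 1)
    if offsets[first_warm] <= target:
        # Common case: the answer lies in the warm tail.
        return _largest_lower_bound(offsets, target, first_warm, last)
    if offsets[0] > target:
        # Target precedes every indexed offset.
        return -1
    # Rare case: fall back to the colder part of the index.
    return _largest_lower_bound(offsets, target, 0, first_warm)
```

With a sorted list such as `[0, 10, 20, 30, 40, 50, 60, 70]`, a lookup near the end touches only the last four slots, while a lookup for an old offset pays for the extra comparison at `first_warm` before searching the rest.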

2611 Commits (c758122ce59674ec3e33618d896e4e5cdbb45e87)