src-kafka

Commit Graph

Author	SHA1	Message	Date
Jason Gustafson	2ac78ff621	MINOR: Propagate LogContext to channel builders and SASL authenticator (#7867 ) The log context is useful when debugging applications which have multiple clients. This patch propagates the context to the channel builders and the SASL authenticator. Reviewers: Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Ismael Juma	6dc6f6a60d	KAFKA-9324: Drop support for Scala 2.11 (KIP-531) (#7859 ) * Adjust build and documentation. * Use lambda syntax for SAM types in `core`, `streams-scala` and `connect-runtime` modules. * Remove `runnable` and `newThread` from `CoreUtils` as lambda syntax for SAM types makes them unnecessary. * Remove stale comment in `FunctionsCompatConversions`, `KGroupedStream`, `KGroupedTable` and `KStream` about Scala 2.11, the conversions are needed for Scala 2.12 too. * Deprecate `org.apache.kafka.streams.scala.kstream.Suppressed` and use `org.apache.kafka.streams.kstream.Suppressed` instead. * Use `Admin.create` instead of `AdminClient.create`. Static methods in Java interfaces can be invoked since Scala 2.12. I noticed that MirrorMaker 2 uses `AdminClient.create`, but I did not change them as Connectors have restrictions on newer client APIs. * Improve efficiency in a few `Gauge` implementations by avoiding unnecessary intermediate collections. * Remove pointless `Option.apply` in `ZookeeperClient` `SessionState` metric. * Fix unused import/variable and other compiler warnings. * Reduce visibility of some vals/defs. Reviewers: Manikumar Reddy <manikumar@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Gwen Shapira <gwen@confluent.io>	5 years ago
Jason Gustafson	bd3bde7623	MINOR: Fix failing test case in TransactionLogTest (#7895 ) This patch fixes a brittle expectation on the `toString` implementation coming from `Set`. This was failing on jenkins with the following error: ``` java.lang.AssertionError: expected:<Some(producerId:1334,producerEpoch:0,state=Ongoing,partitions=Set(topic-0),txnLastUpdateTimestamp=0,txnTimeoutMs=1000)> but was:<Some(producerId:1334,producerEpoch:0,state=Ongoing,partitions=HashSet(topic-0),txnLastUpdateTimestamp=0,txnTimeoutMs=1000)> ``` Instead we convert the collection to a string directly. Reviewers: Boyang Chen <boyang@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>	5 years ago
Dae-Ho Kim	933b982f48	MINOR: Remove compilation warnings (#7888 ) * Warnings 1. `kafka/core/src/test/scala/integration/kafka/server/DelayedFetchTest.scala:110: local val partition in method testReplicaNotAvailable is never used` 2. `kafka/core/src/test/scala/unit/kafka/admin/ConfigCommandTest.scala:527: local val alterResourceName in method verifyAlterBrokerConfig is never used` Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Jason Gustafson	a094504267	KAFKA-9293; Fix NPE in DumpLogSegments offsets parser and display tombstone keys (#7820 ) Fixes an NPE when UserData in a member's subscription is null and adds similar checks for transaction log parser. Also modifies the output logic so that we show the keys of tombstones for both group and transaction metadata. Reviewers: David Arthur <mumrah@gmail.com>	5 years ago
Jason Gustafson	f610f9ff1f	MINOR: Remove spammy, unhelpful log message in the controller (#7879 ) This patch removes a spammy log message in the controller which is printed every time the leader imbalance ratio is checked. It is unhelpful because preferred leaders are generally deterministic and is spammy because it includes _every_ partition in the cluster. Reviewers: Jonathan Santilli <jonathansantilli@users.noreply.github.com>, Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Mickael Maison	b2b6b5c65a	MINOR: Add missing space in isolation-level doc (#7473 ) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Boyang Chen <boyang@confluent.io>	5 years ago
huxi	6d486eddb5	KAFKA-9202: serde in ConsoleConsumer with access to headers (#7736 ) The Deserializer interface has two methods, one that gives access to the headers and one that does not. ConsoleConsumer.scala only calls the latter method. It would be nice if it were to call the default method that provides header access, so that custom serde that depends on headers becomes possible. Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
A. Sophie Blee-Goldman	6b6b36cec0	KAFKA-9232; Coordinator new member timeout does not work for JoinGroup v3 and below (#7753 ) The v3 JoinGroup logic does not properly complete the initial heartbeat for new members, which then expires after the static 5 minute timeout if the member does not rejoin. The core issue is in the `shouldKeepAlive` method, which returns false when it should return true because of an inconsistent timeout check. Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
dengziming	7e71b1ac48	KAFKA-9277:move all group state transition rules into their states (#7787 ) Similar to KAFKA-5258, which moved all partition and replica state transition rules into their states, we move the group state transition rules into their states. Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
Dhruvil Shah	0f0eb49e35	KAFKA-9307; Make transaction metadata loading resilient to previous errors (#7840 ) Allow transaction metadata to be reloaded, even if it already exists as of a previous epoch. This helps with cases where a previous become-follower transition failed to unload corresponding metadata. Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>	5 years ago
Brian Byrne	7f35a67134	KIP-543: Expand ConfigCommand's non-ZK functionality (#7780 ) Allow ConfigCommand to handle more operations without using direct ZooKeeper access, as described by KIP-543. Also allow specifying entity type and name via a single flag-- again, as the KIP describes. Reviewers: Colin P. McCabe <cmccabe@apache.org>	5 years ago
Rajini Sivaram	e275742f85	KAFKA-7251; Add support for TLS 1.3 (#7804 ) Adds support for TLSv1.3 in SslTransportLayer. Note that TLSv1.3 is only enabled from Java 11 onwards, so we test the code only when running with Java 11 and above. Tests run on this PR: - SslTransportLayerTest: This covers testing of our SslTransportLayer and all tests are run with TLSv1.3 when running with Java 11. These tests are also run with TLSv1.2 for all Java versions. - SslFactoryTest: Also run with TLSv1.3 on Java 11 onwards in addition to TLSv1.2 for all Java versions. - SslEndToEndAuthorizationTest - Run only with TLSv1.3 on Java 11 onwards and only with TLSv1.2 on earlier Java versions. We have other versions of this test which use SSL that continue to be run with TLSv1.2 on Java 11 to avoid reducing test coverage for TLSv1.2. Additional testing done for TLSv1.3: - Most tests that use SSL use TestSslUtils.DEFAULT_TLS_PROTOCOL_FOR_TESTS which is set to TLSv1.2. I have run all clients and core tests with DEFAULT_TLS_PROTOCOL_FOR_TESTS=TLSv1.3 with Java 11. - Ran a few system tests locally with TLSv1.3. Reviewers: Ismael Juma <ismael@juma.me.uk>, Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
huxihx	714eea9190	KAFKA-9316: Update command line option 'property' description in ConsoleProducer Author: huxihx <huxi_2b@hotmail.com> Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com> Closes #7854 from huxihx/KAFKA-9316	5 years ago
Lee Dongjin	8c64aa080a	MINOR: trivial cleanups - Reformat header: `CustomDeserializerTest`, `ReplicaVerificationToolTest` - Remove unused constructor: `ConsumerGroupDescription` - Remove unused variables in `TimeOrderedKeyValueBufferTest#shouldRestoreV2Format` - Remove deprecated `Number` constructor calls; use `Number#valueOf` instead. Author: Lee Dongjin <dongjin@apache.org> Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Manikumar Reddy <manikumar.reddy@gmail.com> Closes #7202 from dongjinleekr/cleanup/201908	5 years ago
Jason Gustafson	fd7991ae23	MINOR: Fix throttle usage in reassignment test case (#7822 ) The replication throttle in `testDescribeUnderReplicatedPartitionsWhenReassignmentIsInProgress` was not setting the quota on the partition correctly, so the test case was not working as expected. This patch fixes the problem and also fixes a bug in `waitForTopicCreated` which caused the function to always wait for the full timeout. Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Ismael Juma	16d36f1674	MINOR: Adjust `testClientDisconnectionUpdatesRequestMetrics` to also test small response case (#7754 ) Reviewers: Jun Rao <junrao@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>, Andrew Choi <andchoi@linkedin.com>	5 years ago
David Jacot	9fefe6aa6c	KAFKA-9297: CreateTopics API does not work with older version of the request/response (#7829 ) The create topic API does not work with older versions of the API. It can be reproduced by trying to create a topic with kafka-topics.sh from 2.3. It times out. b94c7f4 has added a check which raises an exception if a field has been set to a non-default value unless the field is marked as "ignorable". The fields added in version 5 of the response are always set regardless of the version used by the client. If an older version is used, an exception is thrown during the serialization because the fields have non-default values. We should either not set the fields for older versions in the API layer or mark them as ignorable. I have chosen the latter in this case because it looks cleaner. Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Mickael Maison <mickael.maison@gmail.com>	5 years ago
David Jacot	ae38315895	KAFKA-8855; Collect and Expose Client's Name and Version in the Brokers (KIP-511 Part 2) (#7749 ) Collect and expose the KIP-511 client name and version information the clients now provide to the server as part of ApiVersionsRequests. Also refactor how we handle selector metrics by creating a ChannelMetadataRegistry class. This will make it easier for various parts of the networking code to modify channel metrics. Reviewers: Colin P. McCabe <cmccabe@apache.org>	5 years ago
huxihx	72df28fe8c	KAFKA-9025: Add a option for path existence check in ZkSecurityMigrator https://issues.apache.org/jira/browse/KAFKA-9025 If a chroot is configured, ZkSecurityMigrator should prompt the user to confirm that the chroot is specified correctly. Author: huxihx <huxi_2b@hotmail.com> Author: huxi <huxi_2b@hotmail.com> Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com> Closes #7618 from huxihx/KAFKA-9025	5 years ago
Ron Dagostino	648497a5e5	KAFKA-9241: Some SASL Clients not forced to re-authenticate (#7784 ) Brokers are supposed to force SASL clients to re-authenticate (and kill such connections in the absence of a timely and successful re-authentication) when KIP-368 SASL Re-Authentication is enabled via a positive connections.max.reauth.ms configuration value. There was a flaw in the logic that caused connections to not be killed in the absence of a timely and successful re-authentication if the client did not leverage the SaslAuthenticateRequest API (which was defined in KIP-152). Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	5 years ago
Rajini Sivaram	d2d6838017	MINOR: Test for non-blocking send using max.block.ms=0 (#7370 ) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Jason Gustafson	5d0cb1419c	KAFKA-9212; Ensure LeaderAndIsr state updated in controller context during reassignment (#7795 ) KIP-320 improved fetch semantics by adding leader epoch validation. This relies on reliable propagation of leader epoch information from the controller. Unfortunately, we have encountered a bug during partition reassignment in which the leader epoch in the controller context does not get properly updated. This causes UpdateMetadata requests to be sent with stale epoch information which results in the metadata caches on the brokers falling out of sync. This bug has existed for a long time, but it is only a problem due to the new epoch validation done by the client. Because the client includes the stale leader epoch in its requests, the leader rejects them, yet the stale metadata cache on the brokers prevents the consumer from getting the latest epoch. Hence the consumer cannot make progress while a reassignment is ongoing. Although it is straightforward to fix this problem in the controller for the new releases (which this patch does), it is not so easy to fix older brokers which means new clients could still encounter brokers with this bug. To address this problem, this patch also modifies the client to treat the leader epoch returned from the Metadata response as "unreliable" if it comes from an older version of the protocol. The client in this case will discard the returned epoch and it won't be included in any requests. Also, note that the correct epoch is still forwarded to replicas correctly in the LeaderAndIsr request, so this bug does not affect replication. Reviewers: Jun Rao <junrao@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Ismael Juma <ismael@juma.me.uk>	5 years ago
NanerLee	8e12c3eda6	KAFKA-9267: ZkSecurityMigrator should not create /controller node [KAFKA-9267](https://issues.apache.org/jira/browse/KAFKA-9267) ZkSecurityMigrator might create a PERSISTENT /controller node with null data, which prevents a controller from being elected. Author: NanerLee <nanerlee@qq.com> Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com> Closes #7778 from NanerLee/fix-ZkSecurityMigrator	5 years ago
Jason Gustafson	1d496a26c9	KAFKA-9179; Fix flaky test due to race condition when fetching reassignment state (#7786 ) This patch fixes a race condition on reassignment completion. The previous code fetched metadata first and then fetched the reassignment state. It is possible in between those times for the reassignment to complete, which leads to spurious URPs being reported. The fix here is to change the order of these checks and to explicitly check for reassignment completion. Note this patch fixes the flaky test `TopicCommandWithAdminClientTest.testDescribeUnderReplicatedPartitionsWhenReassignmentIsInProgress`. Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
Jason Gustafson	eaccb92929	MINOR: Cleanup redundancies in BaseRequestTest (#7735 ) This patch eliminates some redundancy and general messiness around the usage of `BaseRequestTest` and specifically response deserialization. Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Jason Gustafson	5b6de9f2d0	KAFKA-8933; Fix NPE in DefaultMetadataUpdater after authentication failure (#7682 ) This patch fixes an NPE in `DefaultMetadataUpdater` due to an inconsistency in event expectations. Whenever there is an authentication failure, we were treating it as a failed update even if it was from a separate connection from an inflight metadata request. This patch fixes the problem by making the `MetadataUpdater` api clearer in terms of the events that are handled. Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Rajini Sivaram <rajinisivaram@googlemail.com>	5 years ago
Brian Byrne	95581f33f3	MINOR: Adds entity-specific flags to ConfigCommand per KIP-543. (#7667 ) Reviewers: Colin P. McCabe <cmccabe@apache.org>	5 years ago
Ron Dagostino	0871f7b735	KAFKA-9190; Close connections with expired authentication sessions (#7723 ) This patch fixes a bug in `SocketServer` in the expiration of connections which have not re-authenticated quickly enough. Previously these connections were left hanging, but now they are properly closed and cleaned up. This was one cause of the flaky test failures in `EndToEndAuthorizationTest.testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl`. Reviewers: Jason Gustafson <jason@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>	5 years ago
Vikas Singh	2accf14ccf	KAFKA-9265: Fix kafka.log.Log instance leak on log deletion (#7773 ) KAFKA-8448 fixed a similar leak. The Log objects are being held in the ScheduledExecutor PeriodicProducerExpirationCheck callback. The fix in KAFKA-8448 was to change the policy of ScheduledExecutor to remove the scheduled task when it gets canceled (by calling setRemoveOnCancelPolicy(true)). This works when a log is closed using the close() method. But when a log is deleted, either when the topic gets deleted or when the rebalancing operation moves the replica away from the broker, the delete() operation is invoked. Log.delete() doesn't close the pending scheduled task and that leaks the Log instance. The fix is to close the scheduled task in the Log.delete() method too. Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>	5 years ago
Ismael Juma	95b6f42e00	MINOR: Remove logic conditional on exception messages from LogValidator (#7744 ) Such logic is very brittle. Take the chance to simplify the code a bit. Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
Dhruvil Shah	122941793e	MINOR: Add test to ensure log metrics are removed after deletion (#7750 ) Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
Alex Mironov	075f583d95	KAFKA-9156: Fix LazyTimeIndex & LazyOffsetIndex concurrency (#7760 ) Race condition in concurrent `get` method invocation of lazy indexes might lead to multiple `OffsetIndex`/`TimeIndex` objects being concurrently created. When that happens position of `MappedByteBuffer` in `AbstractIndex` is advanced to the last entry which in turn leads to a critical `BufferOverflowException` being thrown whenever leader or replica tries to append a record to the segment. Moreover, `file_=` setter is seemingly also vulnerable to the race, since multiple threads can attempt to set a new file reference as well as create new Time/OffsetIndex objects at the same time. This might lead to the discrepant File references being held by e.g. LazyTimeIndex and its TimeIndex counterpart. This patch attempts to fix the issue by making sure that index objects are atomically constructed when loaded lazily via `get` method. Additionally, `file` reference modifications are also made atomic and thread safe. Note that the `Lazy*Index` mutation operations are executed with a lock held by the callers, but `get` can be called without a lock (e.g. from `Log.read`). Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>, Shilin Lu, Ismael Juma <ismael@juma.me.uk>	5 years ago
huxihx	9dbca4c336	KAFKA-9069: Flaky Test AdminClientIntegrationTest.testCreatePartitions https://issues.apache.org/jira/browse/KAFKA-9069 Make `getTopicMetadata` in AdminClientIntegrationTest always read metadata from controller to get a consistent view. Author: huxihx <huxi_2b@hotmail.com> Reviewers: Guozhang Wang <wangguoz@gmail.com>, José Armando García Sancio <jsancio@gmail.com> Closes #7619 from huxihx/KAFKA-9069	5 years ago
Jason Gustafson	b94c7f479b	MINOR: Add ignorable field check to `toStruct` and fix usage (#7710 ) If a field is not marked as ignorable, we should raise an exception if it has been set to a non-default value. This check already exists in `Message.write`, so this patch adds it to `Message.toStruct`. Additionally, we fix several fields which should have been marked ignorable and we fix some related test assertions. Reviewers: Ismael Juma <ismael@juma.me.uk>, Manikumar Reddy <manikumar.reddy@gmail.com>, Colin Patrick McCabe <cmccabe@apache.org>	5 years ago
Jason Gustafson	054f2f1e8b	MINOR: Fix producer timeouts in log divergence test (#7728 ) This test was taking more than 5 minutes because the producer writes were timing out. The problem was that broker 100 was being shutdown before broker 101, which meant that the partition was still offline after broker 100 was restarted. The producer timeouts were not detected because the produce future was not checked. After the fix, test time drops to about 15s. Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Jason Gustafson	f9fc53ea28	MINOR: Controller should process events without rate metrics (#7732 ) Fixes #7717, which did not actually achieve its intended effect. The event manager failed to process the new event because we disabled the rate metric, which it expected to be present. Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Jason Gustafson	4e431246c3	MINOR: Controller should log UpdateMetadata response errors (#7717 ) Create a controller event for handling UpdateMetadata responses and log a message when a response contains an error. Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Ismael Juma <ismael@juma.me.uk>	5 years ago
Ismael Juma	2b2de2ddc6	MINOR: Use IntegrationTestHarness properly in BaseAdminIntegrationTest (#7705 ) The latter was previously hardcoding logDirCount instead of using the method defined in the superclass since it was unnecessarily duplicating logic. Also tweak IntegrationTestHarness and remove unnecessary method override from SaslPlainPlaintextConsumerTest. Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Jason Gustafson	6ae1af842a	KAFKA-9198; Complete purgatory operations on receiving StopReplica (#7701 ) Force completion of delayed operations when receiving a StopReplica request. In the case of a partition reassignment, we cannot rely on receiving a LeaderAndIsr request in order to complete these operations because the leader may no longer be a replica. Previously when this happened, the delayed operations were left to eventually timeout. Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Ismael Juma <ismael@juma.me.uk> Co-Authored-By: Kun Du <kidkun@users.noreply.github.com>	5 years ago
Ismael Juma	f98d935b3e	KAFKA-9180: Introduce BrokerMetadataCheckpointTest (#7700 ) While investigating KAFKA-9180, I noticed that we had no unit test coverage. It turns out that the behavior was correct, so we just fix the test coverage issue. Also updated .gitignore with jmh-benchmarks/generated. Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Jason Gustafson	7eafea0d57	KAFKA-9196; Update high watermark metadata after segment roll (#7695 ) When we roll a new segment, the log offset metadata tied to the high watermark may need to be updated. This is needed when the high watermark is equal to the log end offset at the time of the roll. Otherwise, we risk exposing uncommitted data early. Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <ismael@juma.me.uk>	5 years ago
Jason Gustafson	ca7a5eae89	HOTFIX: Fix infinite loop in AbstractIndex.indexSlotRangeFor (#7702 ) Fixes a regression from #5378 which caused an infinite loop in `binarySearch`. Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Colin Patrick McCabe	cf84f244e5	MINOR: More efficient midpoint calc for AbstractIndex (#5378 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Jason Gustafson	400185421f	KAFKA-9183; Remove redundant admin client integration testing (#7690 ) This patch creates a `BaseAdminIntegrationTest` to be the root for all integration test extensions. Most of the existing tests will only be tested in `PlaintextAdminIntegrationTest`, which extends from `BaseAdminIntegrationTest`. This should cut off about 30 minutes from the overall build time. Reviewers: David Arthur <mumrah@gmail.com>, Ismael Juma <ismael@juma.me.uk>	5 years ago
Lucas Bradstreet	2bd6b6ff0f	MINOR: fix flaky ConsumerBounceTest.testClose Fixes `java.util.concurrent.ExecutionException: java.lang.AssertionError: Close finished too quickly 5999`. The close test sets a close duration in milliseconds, but measures the time taken in nanoseconds. This leads to a small error due to the resolution in each, where the close is deemed to have taken too little time. When I measured the start and end with nanoTime, I found the time taken to close was `5999641566 ns (5999.6ms)` which seems close enough to be a resolution error. I've run the test 50 times and have not hit the "Close finished too quickly" issue again, whereas previously I hit a failure pretty quickly. Author: Lucas Bradstreet <lucas@confluent.io> Reviewers: Ismael Juma <ismael@juma.me.uk> Closes #7683 from lbradstreet/flaky-consumer-bounce-test	5 years ago
Bob Barrett	fecb977b25	KAFKA-8710; Allow transactional producers to bump producer epoch [KIP-360] (#7115 ) This patch implements the broker-side changes for KIP-360. It adds two new fields to InitProducerId: lastEpoch and producerId. Passing these values allows the TransactionCoordinator to safely bump a producer's epoch after some failures (such as UNKNOWN_PRODUCER_ID and INVALID_PRODUCER_ID_MAPPING). When a producer calls InitProducerId after a failure, the coordinator first checks the producer ID from the request to make sure no other producer has been started using the same transactional ID. If it is safe to continue, the coordinator checks the epoch from the request; if it matches the existing epoch, the epoch is bumped and the producer can safely continue. If it matches the previous epoch, the current epoch is returned without bumping. Otherwise, the producer is fenced. Reviewers: Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>	5 years ago
Lucas Bradstreet	1675115ec1	MINOR: refactor replica last sent HW updates due to performance regression (#7671 ) This change fixes a performance regression due to follower last seen highwatermark handling introduced in 23beeea. maybeUpdateHwAndSendResponse is expensive for brokers with high partition counts, as it requires a partition and a replica lookup for every partition being fetched. This refactor moves the last seen watermark update into the follower fetch state update where we have already looked up the partition and replica. I've seen cases where maybeUpdateHwAndSendResponse is responsible for 8% of CPU usage, not including the responseCallback call that is part of it. I have benchmarked this change with `UpdateFollowerFetchStateBenchmark` and it adds 5ns of overhead to Partition.updateFollowerFetchState, which is a rounding error compared to the current overhead of maybeUpdateHwAndSendResponse. Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>	5 years ago
Rajini Sivaram	6f0008643d	KAFKA-9171: Handle ReplicaNotAvailableException during DelayedFetch (#7678 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
Rajini Sivaram	f15d318aaa	MINOR: Change topic-exists log for CreateTopics from INFO to DEBUG (#7666 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	5 years ago
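
The epoch check described in fecb977b25 (KAFKA-8710, KIP-360) can be sketched roughly as follows. This is a small Python illustration of the decision order given in that commit message, not the broker's code: the names `Decision` and `decide_init_producer_id` and all parameter names are hypothetical, and treating a producer ID mismatch as fencing is an assumption, since the message only says the ID is checked first.

```python
from enum import Enum


class Decision(Enum):
    BUMP_EPOCH = "bump epoch and continue"
    RETURN_CURRENT = "return current epoch without bumping"
    FENCED = "producer is fenced"


def decide_init_producer_id(req_producer_id, req_epoch,
                            current_producer_id, current_epoch, previous_epoch):
    """Hypothetical sketch of the coordinator's decision after a producer failure."""
    # Step 1: make sure no other producer has been started with the same
    # transactional ID. (Assumption: a mismatch is handled as fencing.)
    if req_producer_id != current_producer_id:
        return Decision.FENCED
    # Step 2: the request epoch matches the existing epoch, so bump it.
    if req_epoch == current_epoch:
        return Decision.BUMP_EPOCH
    # Step 3: the request epoch matches the previous epoch, so the current
    # epoch is returned without another bump.
    if req_epoch == previous_epoch:
        return Decision.RETURN_CURRENT
    # Anything else is fenced.
    return Decision.FENCED
```

For example, `decide_init_producer_id(1334, 4, 1334, 5, 4)` falls into the third branch: the request carries the previous epoch, so the current epoch would be returned unchanged.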

2817 Commits (1513c817d4437300865b06bde2ae33210ae05ff9)
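
The resolution mismatch described in 2bd6b6ff0f (flaky `ConsumerBounceTest.testClose`) can be reproduced with plain arithmetic. The sketch below is in Python and only illustrates the effect. The elapsed value 5999641566 ns comes from the commit message; the 6000 ms minimum duration, the helper names, and the tolerance parameter are assumptions for illustration, and the tolerance is one possible remedy rather than the fix the commit actually applied.

```python
NANOS_PER_MILLI = 1_000_000


def elapsed_ms_truncated(elapsed_ns):
    """Convert a nanosecond measurement to whole milliseconds, dropping the remainder."""
    return elapsed_ns // NANOS_PER_MILLI


def finished_too_quickly(elapsed_ns, min_duration_ms, tolerance_ms=0):
    """Hypothetical check: did the close take less than the expected minimum?"""
    return elapsed_ms_truncated(elapsed_ns) < min_duration_ms - tolerance_ms


# Value reported in the commit message: 5999641566 ns, about 5999.6 ms.
measured_ns = 5_999_641_566

# Truncating to milliseconds loses the fractional part and yields 5999.
print(elapsed_ms_truncated(measured_ns))

# Against an assumed 6000 ms minimum, the strict check fails spuriously,
# while a 1 ms tolerance absorbs the resolution error.
print(finished_too_quickly(measured_ns, 6000))
print(finished_too_quickly(measured_ns, 6000, tolerance_ms=1))
```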