The integration test RocksDBMetricsIntegrationTest takes pretty long to complete.
Most of the runtime is spent in the two tests that verify whether the RocksDB
metrics get actual measurements from RocksDB. Those tests need to wait for the thread
that collects the measurements of the RocksDB metrics to trigger the first recordings
of the metrics.
This PR adds a unit test that verifies whether the Kafka Streams metrics get the
measurements from RocksDB and removes the two integration tests that verified it
before. The verification of the creation and scheduling of the RocksDB metrics
recording trigger thread is already contained in KafkaStreamsTest and consequently
it is not part of this PR.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Adding a dynamically updatable log config is currently error prone: it is easy to
define the config as a val rather than a def, which would result in a dynamically
updated broker default not applying to a LogConfig after a broker restart.
This PR adds a guard against introducing these issues by ensuring that all log
configs are exhaustively checked via a test.
For example, if the following line were a val and not a def, there would be a
problem with dynamically updating broker defaults for the config.
4bde9bb3cc/core/src/main/scala/kafka/server/KafkaConfig.scala (L1216)
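For readers unfamiliar with the distinction, here is a minimal, self-contained sketch of why a def is needed; the class and field names are made up for illustration and are not the actual KafkaConfig code. A val captures the broker default once at construction time, while a def re-reads the (possibly dynamically updated) default on every access.

```scala
class BrokerDefaults(var logRetentionMs: Long)

class LogConfigView(defaults: BrokerDefaults) {
  val retentionMsVal: Long = defaults.logRetentionMs  // frozen when the config is built
  def retentionMsDef: Long = defaults.logRetentionMs  // always reflects the current default
}

object ValVsDef extends App {
  val defaults = new BrokerDefaults(logRetentionMs = 1000L)
  val config = new LogConfigView(defaults)
  defaults.logRetentionMs = 2000L        // dynamic broker default update
  println(config.retentionMsVal)         // 1000 -- stale snapshot
  println(config.retentionMsDef)         // 2000 -- picks up the update
}
```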
Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Currently, tumbling windows are defined as "a special case of hopping time windows" in the streams docs, but hopping windows are only explained in a subsequent section.
I think it would make sense to switch the order of these paragraphs around. To me this also makes more sense semantically.
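For context, a rough sketch of the relationship using the Streams TimeWindows API (exact method names may differ across versions): a tumbling window is simply a hopping window whose advance interval equals its size.

```scala
import java.time.Duration
import org.apache.kafka.streams.kstream.TimeWindows

object WindowKinds {
  val size: Duration = Duration.ofMinutes(5)

  // Tumbling: the advance interval defaults to the window size, so windows never overlap.
  val tumbling: TimeWindows = TimeWindows.of(size)

  // Hopping: an explicit advance smaller than the size produces overlapping windows.
  val hopping: TimeWindows = TimeWindows.of(size).advanceBy(Duration.ofMinutes(1))
}
```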
Testing
Built the site and checked that everything looks ok and the HTML is valid (or at least does not contain any new warnings caused by this change).
Reviewers: Bill Bejeck <bbejeck@apache.org>
This commit reworks the SocketServer to always start the acceptor threads after the processor threads and to always stop the acceptor threads before the processor threads. It ensures that the acceptor shutdown is not blocked waiting on the processors to be fully shut down by decoupling the shutdown signal and the awaiting. It also ensures that the processor threads drain their new-connection queues to unblock acceptors that may be waiting. However, the acceptors still bind during startup; only the processing of new connections and requests is further delayed.
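A hypothetical sketch of the ordering invariant described above; the types and method names are illustrative, not the actual SocketServer API.

```scala
class Acceptor {
  def start(): Unit = ()
  def beginShutdown(): Unit = ()   // signal only, does not wait
  def awaitShutdown(): Unit = ()
}
class Processor {
  def start(): Unit = ()
  def drainNewConnections(): Unit = ()  // unblocks acceptors stuck enqueueing
  def shutdown(): Unit = ()
}

class Server(acceptors: Seq[Acceptor], processors: Seq[Processor]) {
  // Sockets are assumed to be bound earlier, during startup.
  def startProcessing(): Unit = {
    processors.foreach(_.start())  // processors first, ready to take connections
    acceptors.foreach(_.start())   // acceptors last
  }

  def stopProcessing(): Unit = {
    acceptors.foreach(_.beginShutdown())        // decoupled signal, no waiting yet
    processors.foreach(_.drainNewConnections()) // let blocked acceptors finish
    acceptors.foreach(_.awaitShutdown())        // acceptors fully stopped first
    processors.foreach(_.shutdown())            // then stop the processors
  }
}
```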
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
When building a release candidate with release.py, if it's not the first RC, we need to drop the previous RC's artifacts from the staging repository before closing the new ones. This adds a log message to remind the release manager of this step.
The patch adds a new test case for validating concurrent read/write behavior in the `Log` implementation. In the process of verifying this, we found a race condition in `read`. The previous logic checks whether the start offset is equal to the end offset before collecting the high watermark. It is possible that the log is truncated between these two checks, which could cause the high watermark to become equal to the log end offset. When this happens, `LogSegment.read` fails because it is unable to find the starting position to read from.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
The split method takes up too many resources and might cause an
OutOfMemoryError when the big batch is huge.
Call closeForRecordAppends() to free up resources
like compression buffers.
Change-Id: Iac6519fcc2e432330b8af2d9f68a8d4d4a07646b
Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
Author: Jiamei Xie <jiamei.xie@arm.com>
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jiangjie (Becket) Qin <becket.qin@gmail.com>
Closes #8286 from jiameixie/outOfMemory
`KafkaApis#handleOffsetDeleteRequest` does not build the response correctly because `topics.add` is not in the correct loop. Fortunately, due to how the response is processed by the admin client, it works but sends redundant information on the wire.
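A hypothetical sketch of the nesting issue; the types below are simplified stand-ins, not the actual KafkaApis code.

```scala
case class PartitionResponse(partition: Int, error: String)
case class TopicResponse(name: String, partitions: Seq[PartitionResponse])

def buildResponse(results: Map[String, Map[Int, String]]): Seq[TopicResponse] = {
  val topics = scala.collection.mutable.Buffer.empty[TopicResponse]
  results.foreach { case (topic, partitionErrors) =>
    val partitions = scala.collection.mutable.Buffer.empty[PartitionResponse]
    partitionErrors.foreach { case (partition, error) =>
      partitions += PartitionResponse(partition, error)
      // Bug being fixed: adding the topic entry here would add it once per
      // partition, bloating the response with redundant entries.
    }
    topics += TopicResponse(topic, partitions.toSeq) // correct: once per topic
  }
  topics.toSeq
}
```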
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
When it comes to actually closing a task we now treat all states exactly the same, and call StateManagerUtil#closeStateManager regardless of whether the task is in CREATED, RESTORING, or RUNNING.
Unfortunately StateManagerUtil doesn't actually check that we own the lock for this task's state. During a dirty close with eos enabled, we wipe the state -- but in some cases this means deleting the state out from under another StreamThread that is still in the process of revoking this task.
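A hypothetical sketch of the missing guard; the lock type and helper below are illustrative stand-ins, not the actual StateDirectory/StateManagerUtil code.

```scala
import java.util.concurrent.locks.ReentrantLock

// Only wipe state if this thread actually owns the task's state lock;
// otherwise another StreamThread still revoking the task may lose its
// state out from under it.
def closeDirtyAndMaybeWipe(stateLock: ReentrantLock, wipeStateDirectory: () => Unit): Unit = {
  if (stateLock.tryLock()) {
    try wipeStateDirectory()
    finally stateLock.unlock()
  }
}
```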
Reviewers: Guozhang Wang <wangguoz@gmail.com>
This patch moves the state change logger logs for handling a LeaderAndIsr/StopReplica request inside the replicaStateChangeLock in order to serialize the logs. This helps to tell apart per-partition actions of concurrent LAIR/StopReplica requests in cases where requests pile up waiting on the lock.
Reviewer: Jun Rao <junrao@gmail.com>
QuotaViolationException generates an exception message via String.format in the constructor
even though the message is often not used, e.g. https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/ClientQuotaManager.scala#L258. We now override `toString` instead.
It also generates an unnecessary stack trace, which is now avoided using the same pattern as in ApiException.
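Roughly, the pattern looks like the following sketch; the field names are illustrative, not the exact class.

```scala
// No String.format in the constructor: the message is built lazily in toString,
// and no stack trace is captured (same idea as ApiException).
class QuotaViolationException(val value: Double, val bound: Double)
    extends RuntimeException {

  override def toString: String =
    s"QuotaViolationException(value=$value, bound=$bound)"

  // Exceptions constructed on hot paths do not need a stack trace.
  override def fillInStackTrace(): Throwable = this
}
```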
I have also avoided using QuotaViolationException for control flow in
ReplicationQuotaManager, which is another hotspot that we have seen in practice.
Reviewers: Gwen Shapira <gwen@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Ismael Juma <ismael@juma.me.uk>
There are two cases in the fetch path where a partition is unnecessarily looked up
from the partition Pool even though it is already accessible. This will be a fairly minor
improvement on high-partition-count clusters, but could be worth about 1% based on some
profiles I have seen.
More importantly, the code is cleaner this way.
Reviewers: Ismael Juma <ismael@juma.me.uk>
FetchRequest.PartitionData.equals unnecessarily uses Object.equals, generating a lot of allocations due to boxing even though primitives are being compared. This is shown in the allocation profile below. Note that the CPU overhead is negligible.

![image](https://user-images.githubusercontent.com/252189/79079019-46686300-7cc1-11ea-9bc9-44fd17bae888.png)
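The difference, sketched with simplified stand-in fields (not the actual PartitionData class): routing primitive fields through a generic equals call boxes them, while a direct comparison allocates nothing.

```scala
import java.util.Objects

final case class PartitionData(fetchOffset: Long, maxBytes: Int) {
  def equalsWithBoxing(other: PartitionData): Boolean =
    Objects.equals(fetchOffset, other.fetchOffset) &&  // boxes each primitive
      Objects.equals(maxBytes, other.maxBytes)

  def equalsWithoutBoxing(other: PartitionData): Boolean =
    fetchOffset == other.fetchOffset &&                // primitive comparison, no allocation
      maxBytes == other.maxBytes
}
```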
Author: Lucas Bradstreet <lucasbradstreet@gmail.com>
Reviewers: Chia-Ping Tsai, Gwen Shapira
Closes#8473 from lbradstreet/avoid-boxing-partition-data-equals
In KAFKA-9826, a log whose first dirty offset was past the start of the active segment and past the last cleaned point resulted in an endless cycle of picking the segment to clean and discarding it. Though this didn't interfere with cleaning other log segments, it kept the log cleaner thread continuously busy (potentially wasting CPU and impacting other running threads) and filled the logs with lots of extraneous messages.
This was determined to be because the active segment was mistakenly picked for cleaning, and because the logSegments code handles the (start == end) case only when (start, end) fall on a segment boundary: the intent is to return an empty list, but if they're not on a segment boundary, the routine returns that segment.
This fix has two parts:
1. It changes logSegments to always return an empty list when start == end (sketched below).
2. It changes the definition of calculateCleanableBytes to not consider anything past the firstUncleanableOffset; previously, it would potentially shift the firstUncleanableOffset to match the firstDirtyOffset even if the firstDirtyOffset was past the firstUncleanableOffset. This has no real effect now in the context of the fix for (1), but it makes the code read more like the model that the code is attempting to follow.
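A simplified sketch of the boundary behavior in part (1); the segment bookkeeping here is a stand-in, not the real Log code.

```scala
final case class Segment(baseOffset: Long)

// Assumes `segments` is sorted by baseOffset.
def logSegments(segments: Vector[Segment], from: Long, to: Long): Vector[Segment] =
  if (from == to) {
    // An empty range is always empty, even when `from` does not fall exactly on a
    // segment's base offset (previously that case returned the containing segment).
    Vector.empty
  } else {
    // The segment containing `from`, plus all later segments starting before `to`.
    val floor = math.max(segments.lastIndexWhere(_.baseOffset <= from), 0)
    segments.drop(floor).takeWhile(_.baseOffset < to)
  }
```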
These changes require modifications to a few test cases that exercised this particular situation; they were introduced in the context of KAFKA-8764. Those situations are now handled elsewhere in the code, but the tests themselves allowed a dirty offset in the active segment and expected the active segment to be selected for cleaning.
Reviewer: Jun Rao <junrao@gmail.com>
When the LogManager resumes cleaning it states that compaction is resumed; however, the topic in question is not necessarily a compacted one.
Author: Lucas Bradstreet <lucas@confluent.io>
Reviewers: Gwen Shapira, Chia-Ping Tsai
Closes#8466 from lbradstreet/bad-cleaning-message
This fixes a version pinning issue where a transitive dependency had a
major version upgrade that a dependency did not account for, breaking
the build.
Reviewers: Andrew Egelhofer <aegelhofer@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This is a follow-up to #8077. The bug exposed a testing gap in how we group partitions. This patch adds a test case which reproduces the reported problem.
Reviewers: David Arthur <mumrah@gmail.com>
The previous code did not use the collection produced by `takeWhile()`.
It only used the length of that collection to select the next element.
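If the goal is just to avoid materializing the prefix, the pattern is roughly the following sketch (not the actual Kafka code):

```scala
def nextAfterPrefix[A](xs: IndexedSeq[A], p: A => Boolean): Option[A] = {
  // Previous shape: builds an intermediate collection only to take its length.
  // val n = xs.takeWhile(p).length
  // Equivalent without materializing the prefix:
  val n = xs.indexWhere(a => !p(a)) match {
    case -1 => xs.length // every element satisfies p
    case i  => i
  }
  xs.lift(n) // the element right after the prefix, if any
}
```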
Reviewers: Ismael Juma <ismael@juma.me.uk>
We were hitting an `IllegalStateException: There is already a changelog registered for ...` in trunk-eos due to failing to call TaskManager#cleanup on unrevoked tasks that we end up closing in handleAssignment after failing to batch commit.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Change TimeoutException to BufferExhaustedException when no memory can be allocated for a record within max.block.ms.
Refactored BufferExhaustedException to be a subclass of TimeoutException so existing code that catches TimeoutException keeps working (see the sketch below).
Added handling to count these exceptions in the metric "buffer-exhausted-records".
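A simplified sketch of the hierarchy change; the real classes live in the Java clients module.

```scala
class TimeoutException(message: String) extends RuntimeException(message)

class BufferExhaustedException(message: String) extends TimeoutException(message)

object Demo extends App {
  try {
    throw new BufferExhaustedException("no memory available within max.block.ms")
  } catch {
    // Existing code that catches TimeoutException still works unchanged.
    case e: TimeoutException => println(s"caught: $e")
  }
}
```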
Test Strategy
There were existing test cases to check this behavior, which I refactored.
I then added an extra case to check whether the expected Exception is actually thrown, which was not covered by current tests.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Remove the restriction in the protocol generation code that a structure
field needs to be part of an array.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Currently a `LeaderAndIsr` request with a stale leader epoch for some partition may still result in the starting of the log dir fetcher for that partition (if the future log exists). I am not sure if this causes any correctness problem since we don't use any state from the request to start the fetcher, but it seems unnecessary to rely on this side effect.
Reviewers: Jun Rao <junrao@gmail.com>
In `validateOffsetsAsync` in the consumer, we group the requests by leader node for efficiency. The list of topic-partitions is grouped from `partitionsToValidate` (all partitions) to `node` => `fetchPositions` (partitions by node). However, when actually sending the request with `OffsetsForLeaderEpochClient`, we use `partitionsToValidate`, which is the list of all topic-partitions passed into `validateOffsetsAsync`. This results in extra partitions being included in requests sent to brokers that are potentially not the leader for those partitions.
This PR fixes the issue by using `fetchPositions`, which is the proper list of partitions to send in each request. Additionally, a small typo in an API name in `OffsetsForLeaderEpochClient` is corrected (it originally referenced `LisfOffsets` as the API name).
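The shape of the fix, sketched with simplified stand-ins for the consumer's internal types:

```scala
final case class TopicPartition(topic: String, partition: Int)
final case class Node(id: Int)
final case class FetchPosition(offset: Long, leader: Node)

def sendValidationRequests(partitionsToValidate: Map[TopicPartition, FetchPosition],
                           send: (Node, Map[TopicPartition, FetchPosition]) => Unit): Unit = {
  // Group all partitions by the leader node of their current position.
  val byNode: Map[Node, Map[TopicPartition, FetchPosition]] =
    partitionsToValidate.groupBy { case (_, pos) => pos.leader }

  byNode.foreach { case (node, fetchPositions) =>
    // Fix: send only this node's partitions (fetchPositions),
    // not the full partitionsToValidate map.
    send(node, fetchPositions)
  }
}
```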
Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>
A read from the end of the log interleaved with a concurrent write can result in reading data above the expected read limit. In particular, this would allow a read above the high watermark. The root of the problem is consecutive calls to `sizeInBytes` in `FileRecords.slice` which do not account for an increase in size due to a concurrent write. This patch fixes the problem by using a single call to `sizeInBytes` and caching the result.
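The shape of the fix, as a sketch with a simplified stand-in for FileRecords (not the actual implementation): the size is read exactly once and that snapshot bounds the slice.

```scala
import java.util.concurrent.atomic.AtomicInteger

// `size` grows as concurrent appends land.
final class Records(size: AtomicInteger) {
  def sizeInBytes: Int = size.get()

  // Returns the (start, length) of the requested view.
  def slice(position: Int, maxSize: Int): (Int, Int) = {
    val currentSize = sizeInBytes                 // single snapshot of the size
    val available = math.max(currentSize - position, 0)
    (position, math.min(available, maxSize))      // a concurrent append cannot widen this
  }
}
```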
Reviewers: Ismael Juma <ismael@juma.me.uk>