1. In both the RocksDBMetrics and Metrics integration tests, we do not need to wait for the consumer to consume records from the output topics, since the sensors / metrics are registered upon task creation.
2. Merged the two RocksDB test cases into one app that creates two state stores (non-segmented and segmented).
With these two changes, the local runtime of these two tests dropped from 2+ and 3+ minutes, respectively, to under a minute.
Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
One of the new RocksDB unit tests creates a non-temporary RocksDB directory in whatever directory the test is run from, leaving some RocksDB files behind after the test(s) are done. We should use the tempDirectory helper for this test instead.
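A minimal sketch of the intended pattern, assuming the test module's TestUtils.tempDirectory() helper and the RocksDB JNI API:

```java
import java.io.File;

import org.apache.kafka.test.TestUtils;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class RocksDBTempDirExample {
    public static void main(final String[] args) throws RocksDBException {
        // Open the DB under a managed temp directory (cleaned up after the
        // test) instead of whatever directory the test is run from.
        final File dbDir = TestUtils.tempDirectory();
        try (final Options options = new Options().setCreateIfMissing(true);
             final RocksDB db = RocksDB.open(options, dbDir.getAbsolutePath())) {
            db.put("key".getBytes(), "value".getBytes());
        }
    }
}
```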
Reviewers: Guozhang Wang <wangguoz@gmail.com>
The integration test RocksDBMetricsIntegrationTest takes pretty long to complete.
Most of the runtime is spent in the two tests that verify whether the RocksDB
metrics get actual measurements from RocksDB. Those tests need to wait for the thread
that collects the measurements of the RocksDB metrics to trigger the first recordings
of the metrics.
This PR adds a unit test that verifies whether the Kafka Streams metrics get the
measurements from RocksDB and removes the two integration tests that verified it
before. The verification of the creation and scheduling of the RocksDB metrics
recording trigger thread is already contained in KafkaStreamsTest and consequently
it is not part of this PR.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
When it comes to actually closing a task we now treat all states exactly the same, and call StateManagerUtil#closeStateManager regardless of whether the task is in CREATED, RESTORING, or RUNNING.
Unfortunately, StateManagerUtil does not check that we actually own the lock for the task's state. During a dirty close with EOS enabled we wipe the state, but in some cases this means deleting the state out from under another StreamThread that is still in the process of revoking this task.
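A minimal sketch of the missing guard; StateDirectory and the lock()/unlock() signatures below are simplified stand-ins for the Streams internals, not the exact fix:

```java
import java.io.IOException;

public class SafeStateWipe {

    // Stand-in for the Streams StateDirectory; signatures are assumptions.
    interface StateDirectory {
        boolean lock(String taskId) throws IOException;
        void unlock(String taskId) throws IOException;
    }

    static void maybeWipeState(final StateDirectory dir, final String taskId) throws IOException {
        // Only wipe if we actually own the state lock; otherwise another
        // StreamThread may still be revoking this task.
        if (!dir.lock(taskId)) {
            return; // someone else holds the lock: skip the wipe
        }
        try {
            wipeStateStores(taskId);
        } finally {
            dir.unlock(taskId);
        }
    }

    private static void wipeStateStores(final String taskId) {
        // delete local state for the task (elided)
    }
}
```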
Reviewers: Guozhang Wang <wangguoz@gmail.com>
We were hitting an IllegalStateException: There is already a changelog registered for ... in trunk-eos because we failed to call TaskManager#cleanup on unrevoked tasks that we end up closing in handleAssignment after a failed batch commit.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Add type bounds to the ProcessorContext, which bounds the types that can be forwarded to child nodes.
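As an illustration of the idea (the shapes below are simplified, not the exact Kafka Streams signatures): once the context carries forward-type bounds, forwarding a mismatched type becomes a compile-time error.

```java
public class TypedContextSketch {

    // Stand-in for a context whose forward types are bounded.
    interface ProcessorContext<KForward, VForward> {
        void forward(KForward key, VForward value);
    }

    static class UpperCaseProcessor {
        void process(final ProcessorContext<String, String> context,
                     final String key, final String value) {
            // Compiles: the forwarded types match the declared bounds.
            context.forward(key, value.toUpperCase());
            // context.forward(key, 42); // would not compile: Integer does not match the value bound
        }
    }
}
```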
Reviewers: Matthias J. Sax <matthias@confluent.io>
Some tasks get closed inside handleAssignment but are not removed from the task manager's bookkeeping list. The next time around they would be closed again, which is an illegal state.
Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Instance-level:
* number of alive stream threads
Thread-level:
* avg / max number of records polled from the consumer per runOnce, INFO
* avg / max number of records processed by the task manager (i.e. across all tasks) per runOnce, INFO
Task-level:
* current number of buffered records (i.e., just a dynamic gauge), DEBUG.
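A minimal sketch of how an application could observe these metrics; the broker address is a placeholder, and exact metric names/groups are assumptions here:

```java
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class MetricsPeek {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "metrics-peek");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // DEBUG recording level enables the task-level gauge listed above.
        props.put(StreamsConfig.METRICS_RECORDING_LEVEL_CONFIG, "DEBUG");

        final StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input").to("output");

        final KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Dump all registered metrics; the new instance-, thread-, and
        // task-level sensors show up here alongside the existing ones.
        for (final Map.Entry<MetricName, ? extends Metric> entry : streams.metrics().entrySet()) {
            System.out.println(entry.getKey().group() + " / "
                + entry.getKey().name() + " = " + entry.getValue().metricValue());
        }

        streams.close();
    }
}
```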
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <john@confluent.io>
As the title suggests, we would like to broaden this check so that we don't fail to close a doomed-to-cleanup task.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
First set of cleanup, pushed to a follow-up PR after KIP-441 Pt. 5. Main changes are:
1. Moved `RankedClient` and the static `buildClientRankingsByTask` to a new file
2. Moved `Movement` and the static `getMovements` to a new file (also renamed to `TaskMovement`)
3. Consolidated the many common variables throughout the assignment tests to the new `AssignmentTestUtils`
4. New utility to generate comparable/predictable UUIDs for tests, and removed the generic from `TaskAssignor` and all related classes
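For item 4, a minimal sketch of the predictable-UUID idea (the helper name is illustrative): derive the UUID from a small integer so tests can order clients deterministically.

```java
import java.util.UUID;

public class PredictableUuids {
    static UUID uuidForInt(final int n) {
        // Most-significant bits fixed at zero; least-significant bits carry n.
        return new UUID(0L, n);
    }

    public static void main(final String[] args) {
        final UUID client1 = uuidForInt(1);
        final UUID client2 = uuidForInt(2);
        // UUIDs sort in the same order as the ints they encode.
        System.out.println(client1.compareTo(client2) < 0); // true
    }
}
```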
Reviewers: John Roesler <vvcephei@apache.org>, Andrew Choi <a24choi@edu.uwaterloo.ca>
For some context, when building a streams application, the optimizer keeps track of the key-changing operations and any repartition nodes that are descendants of the key-changer. During the optimization phase (if enabled), any repartition nodes are logically collapsed into one. The optimizer updates the graph by inserting the single repartition node between the key-changing node and its first child node. This graph update process is done by searching for a node that has the key-changing node as one of its direct parents, and the search starts from the repartition node, going up in the parent hierarchy.
The one exception to this rule is when a merge node is a descendant of the key-changing node: in that case, during the optimization phase, the map tracking key-changers to repartition nodes is updated to use the merge node as the key, and the optimization process places the single repartition node between the merge node and its first child node.
The error in KAFKA-9739 occurred because there was an assumption that the repartition nodes are children of the merge node. But in the topology from KAFKA-9739, the repartition node was a parent of the merge node. So when attempting to find the first child of the merge node, nothing was found (obviously), resulting in a StreamsException (Found a null keyChangingChild node for ..).
This PR fixes this bug by first checking that all repartition nodes for optimization are children of the merge node.
This PR includes a test with the topology from KAFKA-9739.
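For context, a hedged sketch of a topology with the shape described above (a repartition node upstream of the merge); topic names and operations are illustrative, not the exact KAFKA-9739 topology:

```java
import java.time.Duration;

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

public class MergeAfterRepartition {
    public static void main(final String[] args) {
        final StreamsBuilder builder = new StreamsBuilder();

        final KStream<String, String> left = builder.<String, String>stream("left")
            .selectKey((k, v) -> v);                 // key-changing operation
        final KStream<String, String> right = builder.stream("right");

        // The join forces a repartition of the key-changed stream, so the
        // repartition node sits *above* the merge that follows.
        final KStream<String, String> joined = left.join(
            right,
            (lv, rv) -> lv + rv,
            JoinWindows.of(Duration.ofMinutes(5)));

        joined.merge(builder.stream("third")).to("output");

        System.out.println(builder.build().describe());
    }
}
```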
Reviewers: John Roesler <john@confluent.io>
Revert the decision for the sendOffsetsToTransaction(groupMetadata) API to fail against older brokers, for the sake of making applications easier to adapt between versions. This PR silently downgrades the TxnOffsetCommit API when the built request version is smaller than 3.
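A usage sketch of the API in question; with this change the same call also works against older brokers, since the client silently downgrades the request. Broker address, topic, and group id are placeholders:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerGroupMetadata;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringSerializer;

public class TxnOffsetsExample {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-txn-id");

        try (final KafkaProducer<String, String> producer =
                 new KafkaProducer<>(props, new StringSerializer(), new StringSerializer())) {
            producer.initTransactions();
            producer.beginTransaction();

            final Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
                new TopicPartition("input", 0), new OffsetAndMetadata(42L));
            // The overload taking the full group metadata (normally obtained
            // from consumer.groupMetadata()).
            producer.sendOffsetsToTransaction(offsets, new ConsumerGroupMetadata("my-group"));

            producer.commitTransaction();
        }
    }
}
```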
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
As documented in the KIP:
We shall set `transaction.timeout.ms` default to 10000 ms (10 seconds) on Kafka Streams.
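A minimal sketch of overriding the new default, should an application need a longer transaction timeout (values shown are illustrative):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class TxnTimeoutOverride {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);

        // Streams now defaults transaction.timeout.ms to 10000 ms; override
        // it on the embedded producer via the producer prefix.
        props.put(StreamsConfig.producerPrefix(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG),
                  60_000);

        System.out.println(props);
    }
}
```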
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Adds a new TaskAssignor implementation, currently hidden behind an internal feature flag, that implements the high availability algorithm of KIP-441.
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
Once Scala 2.13.2 is officially released, I will submit a follow-up PR
that enables `-Xfatal-warnings` with the necessary warning exclusions.
Compiler warning exclusions were only introduced in 2.13.2, which is why
we have to wait for that release. I used a snapshot build to test it in
the meantime.
Changes:
* Remove Deprecated annotation from internal request classes
* Class.newInstance is deprecated in favor of
Class.getConstructor().newInstance
* Replace deprecated JavaConversions with CollectionConverters
* Remove unused kafka.cluster.Cluster
* Don't use Map and Set methods deprecated in 2.13:
- collection.Map +, ++, -, --, mapValues, filterKeys, retain
- collection.Set +, ++, -, --
* Add scala-collection-compat dependency to streams-scala and
update version to 2.1.4.
* Replace usages of deprecated Either.get and Either.right
* Replace usage of deprecated Integer(String) constructor
* `import scala.language.implicitConversions` is not needed in Scala 2.13
* Replace usage of deprecated `toIterator`, `Traversable`, `seq`,
`reverseMap`, `hasDefiniteSize`
* Replace usage of deprecated alterConfigs with incrementalAlterConfigs
where possible
* Fix implicit widening conversions from Long/Int to Double/Float
* Avoid implicit conversions to String
* Eliminate usage of deprecated procedure syntax
* Remove `println` in `LogValidatorTest` instead of fixing the compiler
warning, since tests should not `println`.
* Eliminate implicit conversion from Array to Seq
* Remove unnecessary usage of 3 argument assertEquals
* Replace `toStream` with `iterator`
* Do not use deprecated SaslConfigs.DEFAULT_SASL_ENABLED_MECHANISMS
* Replace StringBuilder.newBuilder with new StringBuilder
* Rename AclBuffers to AclSeqs and remove usage of `filterKeys`
* More consistent usage of Set/Map in Controller classes; this also fixes
deprecation warnings with Scala 2.13
* Add spotBugs exclusion for inliner artifact in KafkaApis with Scala 2.12.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Measure the percentage of time the stream thread spends processing each task among all assigned active tasks (KIP-444). Also add unit tests to cover the metrics added in this PR and the previous #8358, and try to fix the flaky test reported in KAFKA-5842.
Co-authored-by: John Roesler <vvcephei@apache.org>
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
When a caching state store is closed it calls its flush() method.
If flush() throws an exception the underlying state store is not closed.
This commit ensures that the state store underlying a wrapped state store
is closed even when preceding operations in the close method throw.
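A minimal sketch of the shape of the fix, with the store interfaces as simplified stand-ins for the Streams StateStore hierarchy:

```java
public class SafeClose {

    // Stand-in for the Streams StateStore interface.
    interface StateStore {
        void flush();
        void close();
    }

    static class CachingStore implements StateStore {
        private final StateStore wrapped;

        CachingStore(final StateStore wrapped) {
            this.wrapped = wrapped;
        }

        @Override
        public void flush() {
            // evict dirty cache entries into the wrapped store (elided)
            wrapped.flush();
        }

        @Override
        public void close() {
            try {
                flush();         // may throw
            } finally {
                wrapped.close(); // runs regardless of a flush failure
            }
        }
    }
}
```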
Co-authored-by: John Roesler <vvcephei@apache.org>
Reviewers: John Roesler <vvcephei@apache.org>, Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
* delete topics before tearing down multi-node clusters to avoid leader elections during shutdown
* tear down all nodes concurrently instead of sequentially
Reviewers: Matthias J. Sax <matthias@confluent.io>
1. Within a single while loop, process the tasks in AAABBBCCC order instead of ABCABCABC (see the sketch after this list). This also helps the follow-up PR that times the per-task processing ratio, since it needs to record time less often, hence less overhead.
2. Add thread-level process / punctuate / poll / commit ratio metrics.
3. Fixed a few issues discovered (inline commented).
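A sketch of the iteration-order change from item 1; Task here is a stand-in for the Streams internal type:

```java
import java.util.List;

public class ProcessingOrder {

    interface Task {
        boolean process(); // returns false when no buffered records remain
    }

    static int processOnce(final List<Task> tasks, final int maxPerTask) {
        int processed = 0;
        for (final Task task : tasks) {            // AAABBBCCC: drain a batch from one task...
            for (int i = 0; i < maxPerTask; i++) { // ...before moving on to the next task
                if (!task.process()) {
                    break;
                }
                processed++;
            }
        }
        return processed;
    }
}
```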
Reviewers: John Roesler <vvcephei@apache.org>
This algorithm assigns tasks to clients and tries to
- balance the distribution of the partitions of the
same input topic over stream threads and clients,
i.e., data parallel workload balance
- balance the distribution of work over stream threads.
The algorithm does not take into account potentially existing states
on the client.
The assignment is considered balanced when the difference in
assigned tasks between the stream thread with the most tasks and
the stream thread with the least tasks does not exceed a given
balance factor.
The algorithm prioritizes balance across stream threads
over balance across clients.
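A minimal sketch of the balance condition described above:

```java
import java.util.Collection;

public class BalanceCheck {
    // Balanced when max - min assigned tasks across stream threads does not
    // exceed the balance factor.
    static boolean isBalanced(final Collection<Integer> tasksPerThread, final int balanceFactor) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (final int n : tasksPerThread) {
            min = Math.min(min, n);
            max = Math.max(max, n);
        }
        return max - min <= balanceFactor;
    }
}
```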
Reviewers: John Roesler <vvcephei@apache.org>
Hence, restore / global consumers should never expect a FencedInstanceIdException.
When such an exception is thrown, it means another instance with the same group.instance.id has taken over, so we should treat it as fatal and let this instance close out instead of handling it as task-migrated.
Reviewers: Boyang Chen <boyang@confluent.io>, Matthias J. Sax <matthias@confluent.io>