Some exception paths not previously covered. Extracted `ensureCopartitioning` into a static class.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2448 from dguy/KAFKA-4644
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy, Eno Thereska, Guozhang Wang
Closes#2403 from mjsax/addStreamsClientCompatibilityTest
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#2406 from ijuma/kafka-4636-per-listener-security-settings
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Xavier Léauté <xavier@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#2454 from guozhangwang/KMinor-trace-logging-add-metrics-twice
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Eno Thereska, Damian Guy, Guozhang Wang
Closes#2441 from ijuma/streams-kafka-client-drops-security-configs
`bin-kafka-console-producer.sh` should be `bin/kafka-console-producer.sh`.
Author: Will Marshall <wcm214@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2410 from wmarshall484/typo-fix
This is the HOTFIX PR for any issues detected with KIP-104 until code freeze. Note: do not merge until close to code freeze.
The name changes reflect feedback received while writing the documentation.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Dan Norwood, Guozhang Wang
Closes#2398 from enothereska/hotfix-streams-metrics
Found a few recently added unit tests did not close KStreamTestDriver after the test itself is closed; this can cause RocksDB virtual function called if the contained topology has some persistent store since they will be initialized but not closed in time.
MINOR fix: found that when closing KStreamTestDriver, we need to first flushing all stores before closing any of them; this is triggered from the `KTableKTableLeftJoin.shouldNotThrowIllegalStateExceptionWhenMultiCacheEvictions`.
MINOR fix: in CachingXXXStore, the `name` field is actually used as the cache's namespace, not really the store name or its corresponding topic name. Fixed it by renaming it to `cacheName` and use `this.name()` elsewhere which will call the underlying store's name.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ewen Cheslack-Postava, Eno Thereska, Damian Guy
Closes#2432 from guozhangwang/K3502-kstream-builder-test
close the sessions store in `After` to release rocksdb resources.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2430 from dguy/minor-close-sesion-store
The root cause of this issue is that in InternalTopicManager we are creating topics one-at-a-time, and for this test, there are 31 topics to be created, as a result it is possible that the consumer could time out during the assignment in rebalance, and the next leader has to do the same again because of "makeReady" calls are one-at-a-time.
This patch batches the topics into a single create request and also use the StreamsKafkaClient directly to fetch metadata for validating the created topics. Also optimized a bunch of inefficient code in InternalTopicManager and StreamsKafkaClient.
Minor cleanup: make the exception message more informative in integration tests.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax, Jason Gustafson
Closes#2405 from guozhangwang/K3896-fix-kstream-repartition-join-test
- also some code refactoring and bug fixes
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy, Eno Thereska, Guozhang Wang
Closes#2337 from mjsax/javaDocImprovements4
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy, Eno Thereska, Guozhang Wang
Closes#2401 from mjsax/kafka-4671-window-retention-policy
1. In StreamThread, always use subscribe(Pattern, ..) function in order to avoid sending MetadataRequest with specific topic names and cause brokers to possibly auto-create subscribed topics; the pattern is generated as "topic-1|topic-2..|topic-n".
2. In ConsumerCoordinator, let the leader to refresh its metadata if the generated assignment contains some topics that is not contained in the subscribed topics; also in SubscriptionState, modified the verification for regex subscription to against the regex pattern instead of the matched topics since the returned assignment may contain some topics not yet created when joining the group but existed after the rebalance; also modified some unit tests in `KafkaConsumerTest` to accommodate the above changes.
3. Minor cleanup: changed String[] to List<String> to avoid overloaded functions.
4. Minor cleanup: enforced strong typing in SinkNodeFactory and removed unnecessary unchecked tags.
5. Minor cleanup: augmented unit test error message and fixed a potential transient failure in KafkaStreamTest.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#2379 from guozhangwang/K4633-regex-pattern
Variance changes introduced in KIP-100 cause compilation failures with lambda expression in Java 8. To my knowledge this only affects the following method
`KStreams.transform(TransformerSupplier<...>, String...)`
prior to the changes it was possible to write:
`streams.transform(MyTransformer::new)`
where `MyTransformer` extends `Transformer`
After the changes the Java compiler is unable to infer correct return types for the lambda expressions. This change fixed this by reverting to invariant return types for transformer suppliers.
please cherry-pick into 0.10.2.x
Author: Xavier Léauté <xavier@confluent.io>
Reviewers: Ismael Juma, Damian Guy, Guozhang Wang
Closes#2402 from xvrl/lambdas-oh-my
ZK removed reveal a bug in `StreamPartitionAssigner` but did not fix it properly. This is a follow up bug fix.
Issue:
- If topic metadata is missing, `StreamPartitionAssigner` should not create any affected tasks that consume topics with missing metadata.
- Depending downstream tasks should not be create either.
- For tasks that are not created, no store changelog topics (if any) should get created
- For tasks that write output to not-yet existing internal repartitioning topics, those repartitioning topics should not get created
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy, Guozhang Wang
Closes#2404 from mjsax/kafka-4060-zk-test-follow-up
This log message tends to be extremely verbose when state stores are being restored
Author: Xavier Léauté <xavier@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2412 from xvrl/reduce-verbosity
Re-branched the trunk and applied the changes to the new branch to simplify commit log.
Author: Hojjat Jafarpour <hojjat@Hojjat-Jafarpours-MBP.local>
Reviewers: Ismael Juma, Damian Guy, Eno Thereska, Guozhang Wang
Closes#2389 from hjafarpour/KAFKA-4060-Remove-ZkClient-dependency-in-Kafka-Streams-followup-from-trunk
Address Ismael's comments upon merging
This class doesn't need to override this method as it is handled appropriately by the super class
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2397 from dguy/hotfix-npe-state-store
interface for `Processor` in comments incorrectly had `transform` rather than `process`.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Michael G. Noll, Ismael Juma <ismael@juma.me.uk>
Closes#2396 from dguy/minor-javadoc
In RocksDBStore, options / wOptions / fOptions are constructed in the constructor, which needs to be dismissed in the close() call; however in some tests, the generated topology is not initialized at all, and hence the corresponding state stores are supposed to not be able to be closed as well since their `init` function is not called. This could cause the above option objects to be not released.
This is fixed in this patch to move the logic out of constructor and inside `init` functions, so that no RocksDB objects will be created in the constructor only. Also some minor cleanups:
1. In KStreamTestDriver.close(), we lost the logic to close the state stores but only call `flush`; it is now changed back to call both.
2. Moved the forwarding logic from KStreamTestDriver to MockProcessorContext to remove the mutual dependency: these functions should really be in ProcessorContext, not the test driver.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#2381 from guozhangwang/K3502-pure-virtual-function-unit-tests
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2391 from mjsax/kafka-4060-zk-follow-up-system-tests
This is a follow up of https://github.com/apache/kafka/pull/2166 - refactoring the store hierarchies as requested
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2360 from dguy/state-store-refactor
After debugging this i can see the times that it fails there is a race between when the topic is actually created/ready on the broker and when the assignment happens. When it fails `StreamPartitionAssignor.assign(..)` gets called with a `Cluster` with no topics. Hence the test hangs as no tasks get assigned. To fix this I added a `waitForTopics` method to `EmbeddedKafkaCluster`. This will wait until the topics have been created.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes#2371 from dguy/integration-test-fix
Remove applicationId parameter as it is no longer used.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2385 from dguy/minor-remove-unused-param
This PR is extracted from https://github.com/apache/kafka/pull/2333 as an incremental fix to ease the reviewing:
1. Removed `storeToProcessorNodeMap` from ProcessorTopology since it was previously used to set the context current record, and can now be replaced with the dirty entry in the named cache.
2. Replaced `sourceStoreToSourceTopic` from ProcessorTopology with `storeToChangelogTopic` map, which includes the corresponding changelog topic name for all stores that are changelog enabled.
3. Modified `ProcessorStateManager` to rely on `sourceStoreToSourceTopic` when retrieving the changelog topic; this makes the second parameter `loggingEnabled` in `register` not needed any more, and we can deprecate the old API with a new one.
4. Also fixed a minor issue in `KStreamBuilder`: if the storeName is not provided in the `table(..)` function, do not create the underlying materialized store. Modified the unit tests to cover this case.
5. Fixed a bunch of other unit tests failures that are exposed by this refactoring, in which we are not setting the applicationId correctly when constructing the mocking processor topology.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax, Ewen Cheslack-Postava
Closes#2338 from guozhangwang/KMinor-refactor-state-to-changelogtopic
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira <cshapi@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#2354 from ijuma/kafka-4565-separation-of-internal-and-external-traffic
and remove some unnecessary SuppressWarnings annotations
Author: Xavier Léauté <xavier@confluent.io>
Reviewers: Ismael Juma, Guozhang Wang
Closes#2363 from xvrl/kip-100-followup
Add Global Tables to KafkaStreams. Global Tables are fully replicated once-per instance of KafkaStreams. A single thread is used to update them. They can be used to join with KStreams, KTables, and other GlobalKTables. When participating in a join a GlobalKTable is only ever used to perform a lookup, i.e., it will never cause data to be forwarded to downstream processor nodes.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Eno Thereska, Guozhang Wang
Closes#2244 from dguy/global-tables
- TimeWindows represent half-open time intervals while SessionWindows represent closed time intervals
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#2342 from mjsax/kafka-3452-session-window-follow-up
mjsax
Here's my first pass at finer grained auto offset reset strategies.
I've left TODO comments about whether we want to consider adding this to `KGroupedTable.aggregate` and `KStreamImpl` when re-partitioning a source.
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#2007 from bbejeck/KAFKA-4114_allow_different_offset_reset_strategies
Kafka Streams: add granular metrics per node and per task, also expose ability to register non latency metrics in StreamsMetrics
Also added different recording levels to Metrics.
This is joint contribution from Eno Thereska and Aarti Gupta.
from https://github.com/apache/kafka/pull/1362#issuecomment-218326690-------
We can consider adding metrics for process / punctuate / commit rate at the granularity of each processor node in addition to the global rate mentioned above. This is very helpful in debugging.
We can consider adding rate / total cumulated metrics for context.forward indicating how many records were forwarded downstream from this processor node as well. This is helpful in debugging.
We can consider adding metrics for each stream partition timestamp.
This is helpful in debugging.
## Besides the latency metrics, we can also add throughput latency in terms of source records consumed.
More discussions here https://issues.apache.org/jira/browse/KAFKA-3715, KIP-104, KIP-105
Author: Eno Thereska <eno@confluent.io>
Author: Aarti Gupta <aartiguptaa@gmail.com>
Reviewers: Greg Fodor, Ismael Juma, Damian Guy, Guozhang Wang
Closes#1446 from aartigupta/trunk
The client should send older versions of requests to the broker if necessary.
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2264 from cmccabe/KAFKA-4507
dguy guozhangwang This is a new PR for KAFKA-4060.
Author: Hojjat Jafarpour <hojjat@Hojjat-Jafarpours-MBP.local>
Author: Hojjat Jafarpour <hojjat@HojjatJpoursMBP.attlocal.net>
Reviewers: Damian Guy, Matthias J. Sax, Isamel Juma, Guozhang Wang
Closes#1884 from hjafarpour/KAFKA-4060-Remove-ZkClient-dependency-in-Kafka-Streams-new
Make appropriate methods contravariant in key and value types.
Author: Xavier Léauté <xavier@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#2205 from xvrl/streams-contravariance
Rename `SessionStore.findSessionsToMerge` to `findSessions`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2339 from dguy/minor-findsession-rename