Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <me@ewencp.org>
Closes #1746 from guozhangwang/K4049-RegexSourceIntegrationTest-failure
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Michael G. Noll <michael@confluent.io>, Greg Fodor <gfodor@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1530 from guozhangwang/K3769-per-thread-metrics
guozhangwang enothereska mjsax miguno please take a look. A few things need to be clarified:
1. I've added StreamsConfig.USER_ENDPOINT_CONFIG, but should we have separate configs for host and port or is this one config ok?
2. `HostState` in the KIP has a byte[] field - not sure why, or what it would be populated with.
3. I've changed the API to return `List<KafkaStreamsInstance>` as opposed to `Map<HostInfo, Set<TaskMetadata>>`, as I find this far more intuitive to work with.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Michael G. Noll, Eno Thereska, Guozhang Wang
Closes #1576 from dguy/kafka-3914v2
Added non-null checks to parameters supplied via the DSL and `TopologyBuilder`.
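A minimal sketch of the kind of fail-fast check described above, using `java.util.Objects.requireNonNull`. The class `NullCheckSketch` and its `addSource` method are hypothetical stand-ins for illustration, not the actual DSL or `TopologyBuilder` code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a builder that validates its arguments eagerly.
final class NullCheckSketch {
    private final List<String> sources = new ArrayList<>();

    // Rejects null arguments up front instead of failing later, far from the caller.
    NullCheckSketch addSource(String name, String topic) {
        Objects.requireNonNull(name, "name must not be null");
        Objects.requireNonNull(topic, "topic must not be null");
        sources.add(name + "<-" + topic);
        return this;
    }

    int sourceCount() {
        return sources.size();
    }
}
```

With this shape, a null argument surfaces as a `NullPointerException` with a descriptive message at the call site, and the builder state is left unchanged.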
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Edward Ribeiro <edward.ribeiro@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes #1711 from dguy/kafka-3936
It previously hardcoded it.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke <granthenke@gmail.com>, Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1613 from ijuma/kafka-3954-consumer-internal-topics-from-broker
I affirm that the contribution is my original work and that I license the work to the project under the project's open source license.
This cleans up misbehaviour that was introduced while fixing KAFKA-3817. It is impossible for a non-count aggregate to be built correctly when the addition happens before the removal. IMHO making sure that these details are correct is very important.
This PR has local test errors: it somehow fails the ResetIntegrationTest. It is not clear to me why, but it looks like this PR breaks it, especially because the error is related to the ordering of the events. Still, I am unable to find where I could have broken it. Then again, maybe it is not this PR, since the test seems to fail on trunk as well.
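To illustrate why the order matters, here is a small sketch with a set-valued aggregate. The class `AggregateOrderSketch` and both method names are hypothetical and only model the idea; they are not the actual aggregation code. When a key is updated from an old value to an equal new value, adding first and removing second drops the value from the set, whereas removing first and adding second keeps it. A plain count would not show the difference, since +1 and -1 commute.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of applying an update (oldValue -> newValue) to a set aggregate.
final class AggregateOrderSketch {

    // Removal first, then addition: the new value always ends up in the aggregate.
    static Set<String> removeThenAdd(Set<String> aggregate, String oldValue, String newValue) {
        Set<String> result = new HashSet<>(aggregate);
        result.remove(oldValue);
        result.add(newValue);
        return result;
    }

    // Addition first, then removal: if oldValue equals newValue, the value is lost.
    static Set<String> addThenRemove(Set<String> aggregate, String oldValue, String newValue) {
        Set<String> result = new HashSet<>(aggregate);
        result.add(newValue);
        result.remove(oldValue);
        return result;
    }
}
```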
Author: jfilipiak <Jan.Filipiak@trivago.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1705 from Kaiserchen/KAFKA-3817-preserve-order-for-aggreagators
Rename StateStoreProvider.getStores(...) to StateStoreProvider.stores(...) as this is consistent with the naming of other 'getters' in the public API.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1699 from dguy/minor-method-rename
We are not joining in a window here.
Author: Jendrik Poloczek <jendrik.poloczek@hivestreaming.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1692 from jpzk/trunk
The latter has been deprecated.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1690 from ijuma/rocks-db-dispose-methods-deprecated
The StreamTask is owned by a specific thread, so it doesn't seem necessary to synchronize the processing of the records, as discussed with guozhangwang on the dev mailing list.
Author: PierreCoquentin <pierre.coquentin@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1688 from PierreCoquentin/trunk
Add prefixes for consumer and producer configs to StreamsConfig, but be backward compatible.
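One way such prefix handling could work is sketched below. The class `PrefixedConfigSketch`, the method `clientConfigs`, and the prefix strings `consumer.` and `producer.` are assumptions made for illustration; the actual resolution rules in `StreamsConfig` may differ. In this sketch, unprefixed keys still apply to both clients (the backward-compatible part), and a prefixed key overrides the unprefixed one for its client only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of resolving client configs from a shared config map.
final class PrefixedConfigSketch {
    // Assumed prefix values, used here for illustration only.
    static final String CONSUMER_PREFIX = "consumer.";
    static final String PRODUCER_PREFIX = "producer.";

    static Map<String, Object> clientConfigs(Map<String, Object> all, String prefix) {
        Map<String, Object> result = new HashMap<>();
        // Backward compatibility: unprefixed keys still apply to every client.
        for (Map.Entry<String, Object> e : all.entrySet()) {
            String key = e.getKey();
            if (!key.startsWith(CONSUMER_PREFIX) && !key.startsWith(PRODUCER_PREFIX)) {
                result.put(key, e.getValue());
            }
        }
        // Keys carrying this client's prefix are stripped and override unprefixed values.
        for (Map.Entry<String, Object> e : all.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(prefix)) {
                result.put(key.substring(prefix.length()), e.getValue());
            }
        }
        return result;
    }
}
```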
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Guozhang Wang
Closes #1649 from dguy/kafka-3929
The KafkaStreamsTest can occasionally hang if the test doesn't run fast enough. This is due to there being no brokers available on the broker.urls provided to the StreamsConfig. The KafkaConsumer does a poll and blocks causing the test to never complete.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #1693 from dguy/kafka-streams-test
(cherry picked from commit ce34614a43)
Signed-off-by: Ismael Juma <ismael@juma.me.uk>
Moved the streams application reset tool from tools to core.
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1685 from mjsax/moveResetTool
(cherry picked from commit f2405a73ea)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1673 from mjsax/hotfix
(cherry picked from commit ad1dab9c3d)
Signed-off-by: Ismael Juma <ismael@juma.me.uk>
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Roger Hoover, Matthias J. Sax, Guozhang Wang
Closes #1619 from enothereska/KAFKA-3858-print-topology
Add new config StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG to enable advanced
RocksDB users to override default RocksDB configuration
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Roger Hoover, Dan Norwood, Eno Thereska, Guozhang Wang
Closes #1640 from dguy/kafka-3740-listener
Merge of KAFKA-3812 caused a compilation error in StreamThreadStateStoreProviderTest
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1641 from dguy/fix-compile-error
Move all state directory creation/locking/unlocking/cleaning to a single class. Don't release the channel until the lock is released. Refactor code to make use of new class
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Ismael Juma, Guozhang Wang
Closes #1628 from dguy/kafka-3812
guozhangwang enothereska please review
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Matthias J. Sax, Michael G. Noll, Guozhang Wang
Closes #1565 from dguy/kafka-3912
https://issues.apache.org/jira/browse/KAFKA-3922
KAFKA-3922 add copy-constructor to AbstractStream class
This copy-constructor allows subclasses to access protected variables.
It is intended for extending the KStreamImpl and KTableImpl classes by implementing a decorator pattern.
Author: Florian Hussonnois <florian.hussonnois@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1581 from fhussonnois/KAFKA-3922
Mark all public `TopologyBuilder` methods as synchronized as they can modify data-structures and these methods could be called from multiple threads
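A minimal sketch of the approach, assuming a builder whose methods mutate a shared list. `SynchronizedBuilderSketch`, `addNode`, and `nodeCount` are hypothetical names, not the real `TopologyBuilder` API. Marking the methods `synchronized` serializes access to the underlying `ArrayList`, which is not thread-safe on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder whose public methods may be called from several threads.
final class SynchronizedBuilderSketch {
    private final List<String> nodes = new ArrayList<>();

    // synchronized: only one thread at a time may modify the node list.
    synchronized SynchronizedBuilderSketch addNode(String name) {
        nodes.add(name);
        return this;
    }

    // Reads are synchronized too, so they observe a consistent list.
    synchronized int nodeCount() {
        return nodes.size();
    }
}
```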
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1633 from dguy/kafka-3855
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Geoff Anderson, Guozhang Wang, Ismael Juma
Closes #1621 from enothereska/simple-benchmark-streams-system-tests
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax, Michael G. Noll, Guozhang Wang
Closes #1526 from enothereska/expose-names-dsl
Also move the initialization that restores from changelog to inner stores.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Eno Thereska, Dan Norwood
Closes #1610 from guozhangwang/K3941-avoid-eviction-listener
Also made a pass over the streams unit tests, with the following changes:
1. Removed three integration tests as they are already covered by other integration tests.
2. Merged `KGroupedTableImplTest` into `KTableAggregateTest`.
3. Use mocks whenever possible to reduce code duplication.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1604 from guozhangwang/Kminor-unit-tests-consolidation
It was previously only deleting files/folders where the path started with /tmp. Changed it to delete from the value of the System Property `java.io.tmpdir`. Also changed the tests that were creating State dirs under /tmp to just use `TestUtils.tempDirectory(..)`
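A sketch of the kind of guard this implies: only treat a path as deletable if it lies under the directory named by `java.io.tmpdir`. `TmpDirGuardSketch` and `isUnderTmpDir` are hypothetical names, and the real check may be implemented differently.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical guard that decides whether a path is inside the JVM's temp directory.
final class TmpDirGuardSketch {

    static boolean isUnderTmpDir(File candidate) {
        // Resolve both paths to absolute, normalized form so ".." segments cannot escape.
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir")).toAbsolutePath().normalize();
        Path path = candidate.toPath().toAbsolutePath().normalize();
        // Path.startsWith compares whole name elements, so "/tmpfoo" is not under "/tmp".
        return path.startsWith(tmp);
    }
}
```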
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1600 from dguy/kafka-3942
Author: Nafer Sanabria <nafr.snabr@gmail.com>
Reviewers: Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1595 from naferx/minor-typo
Also handle Null value in SmokeTestUtil.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>
Closes #1597 from guozhangwang/KHotfix-check-null
Minor changes to the null checks.
Author: Jeyhun Karimov <je.karimov@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1591 from jeyhunkarimov/KAFKA-3836
…int to the console.
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #1577 from bbejeck/KAFKA-3794-add-prefix-to-print-functions
`Windows.segments(...)` and `Windows.until(...)` currently aren't returning the `Window` with its type param `W`. This causes the generic type to be lost and therefore methods using this can't infer the correct return types.
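The effect can be shown with a simplified model. `WindowsSketch` and `FixedWindowsSketch` are hypothetical classes, not the real `Windows` hierarchy. Because `until(...)` returns `WindowsSketch<W>` rather than the raw type, a chained call still knows the window type and needs no cast.

```java
// Hypothetical, simplified model of a generic windows specification.
abstract class WindowsSketch<W> {
    protected long retentionMs = Long.MAX_VALUE;

    // Returning WindowsSketch<W> (not the raw WindowsSketch) preserves W for callers.
    WindowsSketch<W> until(long retentionMs) {
        this.retentionMs = retentionMs;
        return this;
    }

    abstract W windowFor(long timestamp);
}

// A concrete specification whose window type is a plain String label.
final class FixedWindowsSketch extends WindowsSketch<String> {
    private final long sizeMs;

    FixedWindowsSketch(long sizeMs) {
        this.sizeMs = sizeMs;
    }

    @Override
    String windowFor(long timestamp) {
        long start = timestamp - (timestamp % sizeMs);
        return "[" + start + "," + (start + sizeMs) + ")";
    }
}
```

With the typed return, `String w = new FixedWindowsSketch(10).until(1000).windowFor(25);` compiles as written. With a raw return type, `windowFor` would yield `Object` and the assignment would need a cast.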
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #1587 from dguy/windows-generics
The `KStream.groupBy(..)` calls don't change the value, only the key, so they don't need the type param `V1` as the new stream will always be of type `KStream<K1, V>`.
The `Serde` in the overloaded `groupBy` should have a type param of `V` to match the returned `KStream`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #1584 from dguy/kstream-generics
follow-up to auto-through feature:
- add sourceNode to transform()
- enable auto-repartitioning in merge()
- null check not required anymore (always join-able due to auto-through)
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1580 from mjsax/hotfix
The contribution is my original work and I license the work to the project under the project's open source license.
Contributors: Guozhang Wang, Phil Derome
guozhangwang
Added checkEmpty to validate that the processor does nothing, and added an inhibit check for filter to fix the issue.
Author: Philippe Derome <phderome@gmail.com>
Author: Phil Derome <phderome@gmail.com>
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1556 from phderome/DEROME-3902
This is part I of the work to add the StreamsConfig to ProcessorContext.
We need to access StreamsConfig in the ProcessorContext so that other components (e.g. RocksDBWindowStore or LRUCache) can retrieve config parameters from the application.
Author: Henry Cai <hcai@pinterest.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1553 from HenryCaiHaiying/config
Current task assignment in TaskAssignor is not deterministic.
During a cluster restart or rolling restart, we have the same set of participating worker nodes. But the current TaskAssignor is not able to maintain a deterministic mapping, so about 20% of partitions will be reassigned, which would cause state repopulation.
When the topology of worker nodes (the number of worker nodes and the TaskIds they are carrying) has not changed, we really just want to keep the old task assignment.
Add the code to check whether the node topology is changing or not:
- when the prevAssignedTasks from the old clientStates is the same as the new task list
- when there is no new node joining (its prevAssignedTasks would be either empty or in conflict with those of some other node)
- when there is no node dropping out (the total of prevAssignedTasks from other nodes would not be equal to the new task list)
When the topology is not changing, we would just use the old mapping.
I also added the code to check whether the previous assignment is balanced (whether the size of each node's task list is within [1/2 average -- 2 * average]); if it's not balanced, we will still start a new task assignment.
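The checks listed above can be sketched roughly as follows. This is a simplified model, not the actual TaskAssignor code: `StickyAssignmentSketch` and `canReusePrevious` are hypothetical names, task ids are plain integers, and client states are reduced to a map from node name to its previously assigned tasks.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of deciding whether the previous assignment can be kept.
final class StickyAssignmentSketch {

    static boolean canReusePrevious(Map<String, Set<Integer>> prevAssignedTasks, Set<Integer> allTasks) {
        if (prevAssignedTasks.isEmpty()) {
            return false;
        }
        Set<Integer> seen = new HashSet<>();
        for (Set<Integer> tasks : prevAssignedTasks.values()) {
            // A node with no previous tasks is treated as newly joined.
            if (tasks.isEmpty()) {
                return false;
            }
            // A task claimed by two nodes is a conflict, which also signals a new node.
            for (Integer task : tasks) {
                if (!seen.add(task)) {
                    return false;
                }
            }
        }
        // If a node dropped out (or tasks changed), the union no longer matches the task list.
        if (!seen.equals(allTasks)) {
            return false;
        }
        // Balance check: every node must hold between average/2 and 2*average tasks.
        double average = (double) allTasks.size() / prevAssignedTasks.size();
        for (Set<Integer> tasks : prevAssignedTasks.values()) {
            if (tasks.size() < average / 2 || tasks.size() > average * 2) {
                return false;
            }
        }
        return true;
    }
}
```

When the method returns `false`, the caller would fall back to computing a fresh assignment.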
Author: Henry Cai <hcai@pinterest.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1543 from HenryCaiHaiying/upstream
ijuma I checked the cases where this test has failed and it seems to always be on the verification of the left join. I've run this test plenty of times and I can't get it to fail. However, in the interest of having stable builds, I've removed just the part of the test that is failing (which happens to be the last verification).
Thanks,
Damian
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ismael Juma, Guozhang Wang
Closes #1549 from dguy/kafka-3896
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>
Closes #1550 from guozhangwang/Kminor-grouppartitioner-javadoc
…stUtils, added method for pausing tests to TestUtils
Changes made:
1. Added utility method for creating consumer configs.
2. Added methods for creating producer, consumer configs with default values for de/serializers.
3. Pulled out method for waiting for test state to TestUtils (not using Thread.sleep).
4. Added utility class for creating streams configs and methods providing default de/serializers.
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1532 from bbejeck/KAFKA_3842_add_helper_functions_test_utils
The method `RocksDB.open` assumes an absolute file path. If a relative path is configured, it leads to an exception like the following:
```
org.apache.kafka.streams.errors.ProcessorStateException: Error opening store CustomerIdToUserIdLookup at location ./tmp/rocksdb/CustomerIdToUserIdLookup
at org.rocksdb.RocksDB.open(Native Method)
at org.rocksdb.RocksDB.open(RocksDB.java:183)
at org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:214)
at org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:165)
at org.apache.kafka.streams.state.internals.RocksDBStore.init(RocksDBStore.java:170)
at org.apache.kafka.streams.state.internals.MeteredKeyValueStore.init(MeteredKeyValueStore.java:85)
at org.apache.kafka.test.KStreamTestDriver.<init>(KStreamTestDriver.java:64)
at org.apache.kafka.test.KStreamTestDriver.<init>(KStreamTestDriver.java:50)
at com.simple.estuary.transform.streaming.CartesianTransactionEnrichmentJobTest.testBuilder(CartesianTransactionEnrichmentJobTest.java:41)
```
Is there any risk to always fetching the absolute path as proposed here?
Let me know if you think this requires a JIRA issue or a unit test. I started working on a unit test, but don't know of a great solution for writing out a file to a relative directory.
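For reference, a minimal sketch of resolving the store location to an absolute path before handing it to a native library. `AbsolutePathSketch` and `storeLocation` are hypothetical names, and this is not necessarily how the change itself is written.

```java
import java.io.File;

// Hypothetical helper that turns a possibly relative state directory into an absolute location.
final class AbsolutePathSketch {

    // getAbsolutePath resolves a relative path against the current working directory.
    // It does not normalize "." or ".." segments; it only makes the path absolute.
    static String storeLocation(String stateDir, String storeName) {
        return new File(stateDir, storeName).getAbsolutePath();
    }
}
```

Called as `storeLocation("./tmp/rocksdb", "CustomerIdToUserIdLookup")`, this yields a path rooted at the working directory. An already absolute state directory is returned unchanged apart from the appended store name.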
This contribution is my original work and I license the work to the project under the project's open source license.
Author: Jeff Klukas <jeff@klukas.net>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1481 from jklukas/rocksdb-abspath