Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Michael G. Noll, Damian Guy, Eno Thereska, Guozhang Wang
Closes #1529 from mjsax/kafka-3880-join-windows
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1533 from enothereska/KAFKA-3872-oom-integration-tests
... by
a) merging some for startup/shutdown efficiency.
b) using independent state dirs.
c) removing some tests that are covered elsewhere.
guozhangwang ewencp - tests are running much quicker now, i.e., down to about 1 minute on my laptop (from about 2 - 3 minutes). There were some issues with state-dirs in some of the integration tests that were causing the shutdown of the streams apps to take a long time.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ismael Juma, Guozhang Wang
Closes #1525 from dguy/integration-tests
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax
Closes #1520 from guozhangwang/KHotfix-iter-hasNext-window-value-getter
- Check if DB is null before flushing or closing. In some cases, a state store is closed twice. This happens in `StreamTask.close()`, where both `node.close()` and `super.close()` (in `ProcessorManager`) are called in sequence. If the user's processor defines a `close` that closes the underlying state store, then the second close will be redundant.
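A minimal sketch of the null-check idea, assuming a store that drops its handle on close. `FakeDb` and `GuardedStore` are hypothetical names used only for illustration; they are not the classes touched by this patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the underlying DB handle; records the calls it receives. */
class FakeDb {
    final List<String> calls = new ArrayList<>();
    void flush() { calls.add("flush"); }
    void close() { calls.add("close"); }
}

/** Hypothetical store wrapper whose flush() and close() tolerate an already-closed store. */
class GuardedStore {
    private FakeDb db;

    GuardedStore(FakeDb db) { this.db = db; }

    void flush() {
        if (db == null) {
            return; // already closed, nothing to flush
        }
        db.flush();
    }

    void close() {
        if (db == null) {
            return; // a second close is a no-op
        }
        flush();
        db.close();
        db = null; // mark as closed so later calls are skipped
    }
}
```

With this guard, calling `close()` from both the processor and the task touches the underlying handle only once.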
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Andrés Gómez, Ismael Juma, Guozhang Wang
Closes #1485 from enothereska/KAFKA-3805-locks
guozhangwang enothereska mjsax miguno
If you get a chance, can you please take a look at this? I've done the repartitioning in the join, but it results in 2 internal topics for each join. This seems like overkill: sometimes we wouldn't need to repartition at all, sometimes we'd need just 1 topic, and sometimes both, but I'm not sure how we can know which case applies.
I'd also need to implement something similar for leftJoin, but again, I'd like to see if I'm heading down the right path or if anyone has any other bright ideas.
For reference - https://github.com/apache/kafka/pull/1453 - the previous PR
Thanks for taking the time and looking forward to getting some welcome advice :-)
Author: Damian Guy <damian.guy@gmail.com>
Author: Damian Guy <damian@continuum.local>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1472 from dguy/KAFKA-3561
This PR is the follow-on to the closed PR #1410.
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1477 from bbejeck/KAFKA-3443_streams_support_for_regex_sources
See https://issues.apache.org/jira/browse/KAFKA-3753
This contribution is my original work and I license the work to the project under the project's open source license.
cc guozhangwang kichristensen ijuma
Author: Jeff Klukas <jeff@klukas.net>
Reviewers: Ismael Juma, Guozhang Wang
Closes #1486 from jklukas/kvstore-size
guozhangwang mjsax enothereska
Currently, Kafka Streams does not have a util to get access to the sequence number added to the key of window state store changelogs. I'm interested in exposing it so that the contents of a changelog topic can be 1) inspected for debugging purposes and 2) saved to and loaded from a text file.
Author: Roger Hoover <roger.hoover@gmail.com>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1501 from theduderog/expose-seq-num
- Fixed the logic calculating the windows that are affected by a new event in the case of hopping windows and a small overlap.
- Added a unit test that tests for the issue.
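To illustrate the window calculation discussed above, here is a sketch that lists the start times of all hopping windows containing a given timestamp. It assumes windows are half-open intervals `[start, start + size)`, aligned to multiples of the advance, with non-negative start times. `HoppingWindows.windowStartsFor` is a hypothetical helper, not the method changed in this patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only. */
class HoppingWindows {
    /**
     * Returns the start times of every window [start, start + size) that contains
     * the timestamp, where starts are non-negative multiples of advance.
     */
    static List<Long> windowStartsFor(long timestamp, long size, long advance) {
        List<Long> starts = new ArrayList<>();
        // Earliest aligned start whose window still covers the timestamp.
        long start = (Math.max(0L, timestamp - size + advance) / advance) * advance;
        while (start <= timestamp) {
            starts.add(start);
            start += advance;
        }
        return starts;
    }
}
```

With size 12 and advance 5, timestamp 21 falls into the windows starting at 10, 15, and 20, while timestamp 22 falls only into those starting at 15 and 20, because `[10, 22)` no longer contains it.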
Author: Tom Rybak <trybak@gmail.com>
Reviewers: Michael G. Noll, Matthias J. Sax, Guozhang Wang
Closes #1462 from trybak/bugfix/KAFKA-3784-TimeWindows#windowsFor-false-positives
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1422 from enothereska/minor-integration-timeout2
Initially proposed by ijuma in https://github.com/apache/kafka/pull/1362#issuecomment-218293662
mjsax commented:
> StreamThread.close() should be extended to call metrics.close() (the class need a private member to reference the Metrics object, too)
The `Metrics` instance is created in the `KafkaStreams` constructor and shared between all threads, so closing it within the threads doesn't seem like the right approach. This PR calls `Metrics.close()` in `KafkaStreams.close()` instead.
cc guozhangwang
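The ownership argument above can be sketched roughly as follows: whoever creates the shared object closes it, once, after the workers have stopped. `SharedRegistry`, `Worker`, and `Owner` are hypothetical stand-ins for `Metrics`, the stream threads, and `KafkaStreams`; this is not the actual change.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared resource standing in for the shared metrics object. */
class SharedRegistry implements AutoCloseable {
    int closeCount = 0;
    @Override
    public void close() { closeCount++; }
}

/** Hypothetical worker that uses, but does not own, the registry. */
class Worker {
    private final SharedRegistry registry;
    boolean closed = false;

    Worker(SharedRegistry registry) { this.registry = registry; }

    void close() { closed = true; } // deliberately leaves the shared registry open
}

/** Hypothetical owner: it creates the registry, so it is the one that closes it. */
class Owner {
    final SharedRegistry registry = new SharedRegistry();
    final List<Worker> workers = new ArrayList<>();

    Owner(int numWorkers) {
        for (int i = 0; i < numWorkers; i++) {
            workers.add(new Worker(registry));
        }
    }

    void close() {
        for (Worker w : workers) {
            w.close();
        }
        registry.close(); // closed exactly once, after every worker has been closed
    }
}
```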
Author: Jeff Klukas <jeff@klukas.net>
Reviewers: Ismael Juma, Guozhang Wang
Closes #1379 from jklukas/close-streams-metrics
I removed the hamcrest matcher to unbreak the build, but we probably want to tweak the `import-control.xml` as it currently only allows it for `<subpackage name="integration">`, which is weird.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1380 from ijuma/fix-streams-config-test-checkstyle
This is an improved version of https://github.com/apache/kafka/pull/1374, where we include a unit test.
/cc ijuma and guozhangwang
Author: Guozhang Wang <wangguoz@gmail.com>
Author: Michael G. Noll <michael@confluent.io>
Reviewers: Michael G. Noll <michael@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #1377 from miguno/streamsconfig-multiple-bootstrap-servers
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Michael G. Noll <michael@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #1311 from guozhangwang/K3639
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma, Michael G. Noll, Guozhang Wang
Closes #1285 from enothereska/more-integration-tests
Fixes wrong KeyValue equals logic when keys are not equal but values are equal.
Original hotfix PR at https://github.com/apache/kafka/pull/1293 (/cc enothereska)
Please review: ewencp ijuma guozhangwang
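As an illustration of the intended equality rule, here is a sketch of a pair type whose `equals` requires both components to match. `Pair` is a hypothetical class written for this example; it is not the `KeyValue` source.

```java
import java.util.Objects;

/** Hypothetical pair type for illustration only. */
final class Pair<K, V> {
    final K key;
    final V value;

    Pair(K key, V value) {
        this.key = key;
        this.value = value;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Pair)) {
            return false;
        }
        Pair<?, ?> other = (Pair<?, ?>) obj;
        // Both components must match; equal values alone are not enough.
        return Objects.equals(key, other.key) && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(key, value); // kept consistent with equals
    }
}
```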
Author: Eno Thereska <eno.thereska@gmail.com>
Author: Michael G. Noll <michael@confluent.io>
Reviewers: Michael G. Noll <michael@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1294 from miguno/KeyValue-equality-hotfix
- add class doc for KTable, KStream, JoinWindows
- add missing return tags
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Michael G. Noll <michael@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1287 from mjsax/kafka-3440-JavaDoc
This PR includes the same code as https://github.com/apache/kafka/pull/1261 but is rebased on latest trunk.
Author: Michael G. Noll <michael@confluent.io>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #1277 from miguno/KAFKA-3613-v2
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma, Damian Guy, Michael G. Noll, Guozhang Wang
Closes #1260 from enothereska/KAFKA-3612-integration-tests
guozhangwang
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #1272 from dguy/kstreamimpl-to-npe and squashes the following commits:
49d48fb [Damian Guy] actually commit the fix
07ce589 [Damian Guy] fix npe in KStreamImpl.to(..)
74d396d [Damian Guy] fix npe in KStreamImpl.to(..)
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma, Michael G. Noll, Guozhang Wang
Closes #1266 from mjsax/kafka-3599-minorCodeCleanup
Feel free to review guozhangwang enothereska mjsax.
Author: Michael G. Noll <michael@confluent.io>
Reviewers: Matthias J. Sax, Michael G. Noll, Eno Thereska
Closes #1262 from miguno/KAFKA-3614
This contribution is my original work and I license the work to the Kafka project under the project's open source license.
cc guozhangwang miguno ymatsuda
Author: Jeff Klukas <jeff@klukas.net>
Reviewers: Michael G. Noll <michael@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #1270 from jklukas/streams-doc-fix
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ismael Juma, Josh Gruenberg, Michael G. Noll, Ewen Cheslack-Postava
Closes #1229 from guozhangwang/K3499
Kafka Streams seems to hold file handles on the `.lock` files for the state dirs, resulting in an explosion of file handles over time. Running `lsof` shows the number of open file handles on the `.lock` file increasing rapidly over time. In a separate test project, I reproduced the issue and determined that, in order for the file handle to be relinquished, the `FileChannel` instance must be properly closed. Applying this patch seems to resolve the issue in my job.
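A minimal sketch of the pattern, assuming the lock is taken through `java.nio`: releasing the `FileLock` alone leaves the descriptor open, so the `FileChannel` is closed as well (here via try-with-resources). `LockFileDemo.withLock` is a hypothetical helper, not the code in this patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration only. */
class LockFileDemo {
    /**
     * Tries to lock lockFile, runs the action while holding the lock, then releases
     * the lock and closes the channel. Returns false if the lock is held elsewhere.
     */
    static boolean withLock(Path lockFile, Runnable action) {
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return false; // held by another process; the channel is still closed on exit
            }
            try {
                action.run();
            } finally {
                lock.release();
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the channel is closed on every path out of the `try` block, repeated lock attempts should not accumulate open handles on the `.lock` file.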
Author: Greg Fodor <gfodor@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1267 from gfodor/bug/state-lock-filehandle-leak
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Michael G. Noll <michael@confluent.io>
Closes #1255 from dguy/kgroupedtable-count-test
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1246 from guozhangwang/K3589
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1231 from mjsax/kafka-3337-extact-key-selector-from-agg
For enums and other constant strings, use locale-independent case conversions to enable comparisons to work regardless of the default locale.
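A short sketch of why this matters, assuming `Locale.ROOT` is used for the conversion. Under a Turkish locale, upper-casing `i` yields a dotted capital `İ`, so a locale-sensitive conversion of `"insert"` no longer matches the constant `INSERT`. `CaseConversionDemo` and its `Mode` enum are hypothetical and exist only for this example.

```java
import java.util.Locale;

/** Hypothetical example class for illustration only. */
class CaseConversionDemo {
    /** Hypothetical enum with a name containing the letter I. */
    enum Mode { INSERT, UPDATE }

    /** Parses a mode name with a conversion that ignores the default locale. */
    static Mode parse(String name) {
        return Mode.valueOf(name.toUpperCase(Locale.ROOT));
    }
}
```

`parse("insert")` resolves to `Mode.INSERT` whatever the JVM's default locale is, whereas `"insert".toUpperCase()` would produce a different string when the default locale is Turkish.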
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Manikumar Reddy, Ismael Juma, Guozhang Wang, Gwen Shapira
Closes #1220 from rajinisivaram/KAFKA-3548
… With KStream the method selectKey was added to enable getting a key from values before performing aggregation-by-key operations on original streams that have null keys.
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1222 from bbejeck/KAFKA-3430_allow_users_to_set_key_KTable_toStream
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1217 from granthenke/close-consumers
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes #1203 from enothereska/KAFKA-3504-logcompaction