Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy, Eno Thereska, Guozhang Wang
Closes #2931 from mjsax/kafka-5140-flaky-reset-integration-test
A KStream.to() sink is also a topic
... so the KStreamTestDriver must be able to fetch it when required
Author: Wim Van Leuven <wim.vanleuven@bigboards.io>
Author: Wim Van Leuven <wim.vanleuven@highestpoint.biz>
Reviewers: Eno Thereska, Matthias J. Sax, Guozhang Wang
Closes #2716 from wimvanleuven/KAFKA-4927
This fix needs to be backported to 0.10.2 as well.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Damian Guy, Ismael Juma, Guozhang Wang
Closes #2982 from enothereska/KAFKA-5174-1-core
Change fetchPrevious to use findSessions with the proper key and timestamps rather than using fetch.
Author: Kyle Winkelman <kyle.winkelman@optum.com>
Reviewers: Damian Guy, Guozhang Wang
Closes #2972 from KyleWinkelman/CachingSessionStore-fetchPrevious
The code was correct since the method is only called from
one thread, but the change is worthwhile anyway.
Author: Amit Daga <adaga@adobe.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2966 from amitdaga/findbugs-streams-multithread
Add new broker config, `group.initial.rebalance.delay.ms`, with a default of 3 seconds.
When a consumer creates a new group, set the group's state to InitialRebalance and delay the rebalance until `min(group.initial.rebalance.delay.ms, rebalanceTimeout)`. As other members join the group, further delay the rebalance by `min(group.initial.rebalance.delay.ms, remainingRebalanceTimeout)`. Once `rebalanceTimeout` is hit or no new members join the group within the delay, complete the rebalance.
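The delay rule above can be sketched as a small simulation. This is only an illustration, not the broker code: the class `InitialRebalanceDelay`, its method names, and the choice to model each delay as a window that is extended once if any member joined during it are assumptions made for this example.

```java
// Hypothetical sketch of the delay arithmetic described above (not broker code).
final class InitialRebalanceDelay {

    /**
     * Returns the time (ms after group creation) at which the initial
     * rebalance would complete, given the times at which further members join.
     */
    static long completionTimeMs(long delayMs, long rebalanceTimeoutMs, long[] joinTimesMs) {
        long windowStart = 0;
        // First delay: min(group.initial.rebalance.delay.ms, rebalanceTimeout)
        long deadline = Math.min(delayMs, rebalanceTimeoutMs);
        long remaining = rebalanceTimeoutMs - deadline;
        // Extend while members joined in the last window and timeout budget is left.
        while (remaining > 0 && joinedWithin(joinTimesMs, windowStart, deadline)) {
            // Further delay: min(group.initial.rebalance.delay.ms, remainingRebalanceTimeout)
            long extension = Math.min(delayMs, remaining);
            windowStart = deadline;
            deadline += extension;
            remaining -= extension;
        }
        return deadline;
    }

    private static boolean joinedWithin(long[] joinTimesMs, long fromExclusive, long toInclusive) {
        for (long t : joinTimesMs) {
            if (t > fromExclusive && t <= toInclusive) {
                return true;
            }
        }
        return false;
    }
}
```

With the default delay of 3000 ms and an assumed `rebalanceTimeout` of 10000 ms, a group with no further joins completes after 3000 ms, while a group whose members keep joining is capped at 10000 ms.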
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ewen Cheslack-Postava, Guozhang Wang
Closes #2758 from dguy/kafka-4925
…dSource
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2783 from bbejeck/HOTFIX_potentially_hanging_test_in_RegexSourceIntegrationTest
This is the implementation of KIP-114: KTable state stores and improved semantics:
- Allow for decoupling between querying and materialisation
- consistent APIs, overloads with queryableName and without
- deprecated several KTable calls
- new unit and integration tests
In this implementation, state stores are materialized if the user desires them to be queryable. In subsequent versions we can offer a second option, to have a view-like state store. The tradeoff then would be between storage space (materialize) and re-computation (view). That tradeoff can be exploited by later query optimizers.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax, Guozhang Wang
Closes #2832 from enothereska/KAFKA-5045-ktable
The descendingSubsequence is a misnomer. The linked list is actually arranged so that the lowest timestamp is first and larger timestamps are added to the end, therefore renamed to ascendingSubsequence.
The minElem variable was also misnamed. It's actually the current maximum element as it's taken from the end of the list.
Added comment to get() to make it clear it's returning the lowest timestamp.
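For readers unfamiliar with the structure, here is a minimal sketch of an ascending-subsequence tracker. It is not the class touched by this change: `AscendingTimestampTracker`, the use of plain `long` timestamps, strict comparison on insert, and returning `-1` when empty are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: tracks the minimum timestamp of a FIFO buffer.
final class AscendingTimestampTracker {
    // Lowest timestamp first; larger timestamps are appended at the end.
    private final Deque<Long> ascendingSubsequence = new ArrayDeque<>();

    /** Called when a record with this timestamp enters the buffer. */
    void add(long timestamp) {
        // The element at the end of the list is the current maximum.
        Long maxElem = ascendingSubsequence.peekLast();
        while (maxElem != null && maxElem > timestamp) {
            // Larger timestamps that arrived earlier can never be the minimum again.
            ascendingSubsequence.removeLast();
            maxElem = ascendingSubsequence.peekLast();
        }
        ascendingSubsequence.addLast(timestamp);
    }

    /** Called when the oldest record (FIFO order) leaves the buffer. */
    void remove(long timestamp) {
        Long first = ascendingSubsequence.peekFirst();
        if (first != null && first == timestamp) {
            ascendingSubsequence.removeFirst();
        }
    }

    /** Returns the lowest tracked timestamp, or -1 if nothing is tracked. */
    long get() {
        Long first = ascendingSubsequence.peekFirst();
        return first == null ? -1L : first;
    }
}
```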
Author: mihbor <mihbor@users.noreply.github.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2948 from mihbor/patch-4
- addressing open Github comments from #2773
- test clean-up
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy, Guozhang Wang
Closes #2854 from mjsax/kafka-4986-producer-per-task-follow-up
1. Added a flag in the LRU Store to indicate whether it is restoring; since we only have a restore callback, we have to set it each time a change is applied.
2. Fixed the corresponding unit test, plus some minor cleaning up.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes #2908 from guozhangwang/K4379-remove-listener
- mainly moving methods
- also improved logging
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy, Eno Thereska, Guozhang Wang
Closes #2917 from mjsax/kafka-5111-code-cleanup-follow-up
* Producer and Consumer `close` calls were not handled via `try-with-resources`
* `cleanRun` unused field removed
* Refactored handling of Consumer configuration in `IntegrationTestUtils` to ensure auto-committing of offsets and starting from `earliest`
* As a result reverted https://github.com/apache/kafka/pull/2921 since it's redundant now
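As a reminder of why `try-with-resources` matters for the first point, the sketch below shows that the resource is closed even when the body throws. `TrackedClient` is a hypothetical stand-in for a producer or consumer, not a Kafka class.

```java
// Hypothetical stand-in for a client that must always be closed.
final class TrackedClient implements AutoCloseable {
    private boolean closed;

    void send(String record) {
        if (record == null) {
            throw new IllegalArgumentException("record must not be null");
        }
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class TryWithResourcesSketch {
    /** Returns true if the client was closed although send() failed. */
    static boolean closedDespiteFailure() {
        TrackedClient observed = new TrackedClient();
        try (TrackedClient client = observed) {
            client.send(null); // throws; close() still runs before the catch block
        } catch (IllegalArgumentException expected) {
            // ignored on purpose for this illustration
        }
        return observed.isClosed();
    }
}
```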
Author: Armin Braun <me@obrown.io>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2920 from original-brownbear/cleanup-it-utils-closing
Fixes `org.apache.kafka.streams.integration.utils.IntegrationTestUtils#readKeyValues` potentially starting to `poll` for stream output after the stream finished sending the test data and hence missing it when working with `latest` offsets.
Author: Armin Braun <me@obrown.io>
Reviewers: Eno Thereska, Matthias J. Sax, Guozhang Wang
Closes #2921 from original-brownbear/KAFKA-5124
Refactors Task with proper interface methods `init()`, `resume()`, `commit()`, `suspend()`, and `close()`. All other methods for task handling are internal now. This simplifies the `StreamThread` code, avoids code duplication, and makes the control flow easier to reason about.
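To illustrate how a narrow lifecycle interface keeps the calling code simple, here is a rough sketch. Only the five method names come from the description above; `SketchTask`, `RecordingTask`, `SketchThreadLoop`, and the exact call order in each phase are assumptions for this example and do not mirror the real `StreamThread`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lifecycle interface using the five method names listed above.
interface SketchTask {
    void init();
    void resume();
    void commit();
    void suspend();
    void close();
}

// Test double that records which lifecycle methods were invoked.
final class RecordingTask implements SketchTask {
    final List<String> calls = new ArrayList<>();

    @Override public void init() { calls.add("init"); }
    @Override public void resume() { calls.add("resume"); }
    @Override public void commit() { calls.add("commit"); }
    @Override public void suspend() { calls.add("suspend"); }
    @Override public void close() { calls.add("close"); }
}

// Hypothetical caller: each phase is a plain loop over the public methods.
final class SketchThreadLoop {
    static void start(List<? extends SketchTask> tasks) {
        for (SketchTask task : tasks) {
            task.init();
        }
    }

    static void rebalance(List<? extends SketchTask> tasks) {
        for (SketchTask task : tasks) {
            task.suspend();
        }
        for (SketchTask task : tasks) {
            task.resume();
        }
    }

    static void shutdown(List<? extends SketchTask> tasks) {
        for (SketchTask task : tasks) {
            task.commit();
            task.close();
        }
    }
}
```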
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma, Damian Guy, Eno Thereska, Guozhang Wang
Closes #2895 from mjsax/kafka-5111-cleanup-task-code
- call close() on Metrics to join created threads
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Eno Thereska, Damian Guy, Guozhang Wang
Closes #2788 from mjsax/minor-improve-test-metric-cleanup
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma, Eno Thereska, Matthias J. Sax, Guozhang Wang
Closes #2837 from mjsax/kafka-4564-fail-fast-test-stream-compatibility
If `partition==null` and `partitioner!=null`, we should not fall back to the default partitioner (as we did before the patch if `producer.partitionsFor(...)` returned an empty list). Falling back to the default partitioner might corrupt hash partitioning.
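The decision can be sketched as follows. This is an illustration under assumptions, not the patched code: `PartitionResolver`, its nested `Partitioner` interface, and the choice to throw an `IllegalStateException` when no partition metadata is available are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of choosing a partition without a silent fallback.
final class PartitionResolver {

    /** Hypothetical custom partitioner: maps a key to one of numPartitions. */
    interface Partitioner<K> {
        Integer partition(K key, int numPartitions);
    }

    /**
     * Returns the partition to send to, or null if the producer's default
     * partitioner should decide (only when no custom partitioner is given).
     */
    static <K> Integer resolve(Integer partition, Partitioner<K> partitioner,
                               K key, List<Integer> knownPartitions) {
        if (partition != null) {
            return partition; // explicitly requested partition wins
        }
        if (partitioner == null) {
            return null; // no custom partitioner: the default one may be used
        }
        if (knownPartitions.isEmpty()) {
            // Assumption for this sketch: fail instead of falling back, because
            // the default partitioner could place the key on a different partition.
            throw new IllegalStateException("No partition metadata available");
        }
        return partitioner.partition(key, knownPartitions.size());
    }
}
```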
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Eno Thereska, Damian Guy, Guozhang Wang
Closes #2868 from mjsax/minor-fix-RecordCollector
fix some spelling errors
Author: xinlihua <xin.lihua1@zte.com.cn>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2871 from auroraxlh/fix_spellingerror
Skip null keys when initializing GlobalKTables. This is in line with what happens during normal processing.
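A minimal sketch of the idea, assuming a plain in-memory map as the store: `GlobalTableRestoreSketch` and its `restore` method are hypothetical names and do not reflect the actual restore code path.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: populate a store from records, ignoring null keys.
final class GlobalTableRestoreSketch {
    static Map<String, String> restore(List<Map.Entry<String, String>> records) {
        Map<String, String> store = new HashMap<>();
        for (Map.Entry<String, String> record : records) {
            if (record.getKey() == null) {
                continue; // records without a key are skipped
            }
            store.put(record.getKey(), record.getValue());
        }
        return store;
    }
}
```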
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Michael G. Noll, Eno Thereska, Matthias J. Sax, Guozhang Wang
Closes #2834 from dguy/kafka-5047
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy, Eno Thereska, Guozhang Wang
Closes #2789 from mjsax/minor-improve-integration-test
Change `consumer.position` so that it always updates any partitions that need an update. Keep track of the partitions on which `seekToBeginning` is called in `StoreChangeLogReader` and make the `consumer.position` call after all `seekToBeginning` calls.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang, Jason Gustafson, Ismael Juma
Closes #2769 from dguy/kafka-4937
Enable producer per task if exactly-once config is enabled.
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Eno Thereska <eno@confluent.io>, Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2773 from mjsax/exactly-once-streams-producer-per-task
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2848 from enothereska/KAFKA-5038-trunk
Set the internal consumer config `internal.leave.group.on.close` in
`StreamsConfig`. This is to reduce the number of rebalances we get
during bounces.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2750 from dguy/kafka-4965
Author: Michael G. Noll <michael@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2751 from miguno/trunk-streams-window-iterator-doc-fixes
Author: Eno Thereska <eno@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2819 from enothereska/minor-increase-retries
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Eno Thereska <eno@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2780 from cmccabe/KAFKA-4995
Highlight that the range in `fetch` is inclusive of both `timeFrom` and `timeTo`
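To make the inclusive bounds concrete, here is a small sketch over a sorted map keyed by window start time. It is not the window store implementation: `InclusiveWindowFetch` and the `NavigableMap` backing are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical sketch of a fetch whose range includes both end points.
final class InclusiveWindowFetch {
    static <V> List<V> fetch(NavigableMap<Long, V> windowsByStart, long timeFrom, long timeTo) {
        if (timeFrom > timeTo) {
            return Collections.emptyList(); // subMap would reject an inverted range
        }
        // 'true' on both bounds: windows starting exactly at timeFrom or timeTo are returned.
        return new ArrayList<>(windowsByStart.subMap(timeFrom, true, timeTo, true).values());
    }
}
```

With windows starting at 10, 20 and 30, a fetch from 10 to 30 returns all three, while a fetch from 11 to 29 returns only the window starting at 20.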
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Michael G. Noll <michael@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2811 from dguy/minor-window-fetch-java-doc
Author: Michael G. Noll <michael@confluent.io>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Eno Thereska <eno@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2812 from miguno/trunk-streams-examples-docs
We should catch `InvalidTopicException` and not just
`NoOffsetForPartitionException`. Also, we need to step through
all partitions that might be affected and reset those.
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bbejeck@gmail.com>, Eno Thereska <eno@confluent.io>, Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2747 from mjsax/minor-fix-reset
There should only be a single `KafkaStreams.StreamStateListener` to
ensure synchronization of operations on
`KafkaStreams.StreamStateListener#threadState`.
Author: Armin Braun <me@obrown.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2801 from original-brownbear/fix-stream-state-listener
This fixes:
```
java.lang.AssertionError: expected:<2> but was:<3>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at org.apache.kafka.streams.processor.internals.StateDirectoryTest.shouldCleanUpTaskStateDirectoriesThatAreNotCurrentlyLocked(StateDirectoryTest.java:145)
```
While running the test in an infinite loop, I hit other problems:
- fixed file management (release all locks and close everything)
- increased sleep time for `shouldCleanupStateDirectoriesWhenLastModifiedIsLessThanNowMinusCleanupDelay` too (was flaky as well)
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Eno Thereska <eno@confluent.io>, Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2781 from mjsax/minor-fix-stateDirectoryTest
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Eno Thereska <eno@confluent.io>, Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2777 from mjsax/hotfix-window-serdes-trunk
Several fixes for handling broker failures:
- default replication value for internal topics is now 3 in the test itself (not in streams code; that will require a KIP).
- streams producer waits for acks from all replicas in the test itself (not in streams code; that will require a KIP).
- backoff time for streams client to try again after a failure to contact controller.
- fix bug related to state store locks (this helps in multi-threaded scenarios)
- fix related to catching exceptions properly for network errors.
- system test for all the above
Author: Eno Thereska <eno@confluent.io>
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Dan Norwood <norwood@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2719 from enothereska/KAFKA-4916-broker-bounce-test
fixes:
```
java.nio.file.NoSuchFileException: /tmp/test7863510415433793941/topic2-Canonized/topic2-Canonized-197001010000/000015.sst
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:97)
at java.nio.file.Files.readAttributes(Files.java:1686)
at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:105)
at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:199)
at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:199)
at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:199)
at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:69)
at java.nio.file.Files.walkFileTree(Files.java:2602)
at java.nio.file.Files.walkFileTree(Files.java:2635)
at org.apache.kafka.common.utils.Utils.delete(Utils.java:555)
at org.apache.kafka.streams.kstream.internals.KStreamWindowAggregateTest.testJoin(KStreamWindowAggregateTest.java:320)
```
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Eno Thereska <eno@confluent.io>, Damian Guy <damian.guy@gmail.com>, Jun Rao <junrao@gmail.com>
Closes #2778 from mjsax/minor-fix-kstreamWindowAggregateTest
This may be a reason why we see Jenkins jobs time out at times.
I can reproduce it locally.
With current trunk there is a possibility to run into this:
```sh
"kafka-streams-close-thread" #585 daemon prio=5 os_prio=0 tid=0x00007f66d052d800 nid=0x7e02 waiting for monitor entry [0x00007f66ae2e5000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.kafka.streams.processor.internals.StreamThread.close(StreamThread.java:345)
- waiting to lock <0x000000077d33c538> (a org.apache.kafka.streams.processor.internals.StreamThread)
at org.apache.kafka.streams.KafkaStreams$1.run(KafkaStreams.java:474)
at java.lang.Thread.run(Thread.java:745)
"appId-bd262a91-5155-4a35-bc46-c6432552c2c5-StreamThread-97" #583 prio=5 os_prio=0 tid=0x00007f66d052f000 nid=0x7e01 waiting for monitor entry [0x00007f66ae4e6000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.kafka.streams.KafkaStreams.setState(KafkaStreams.java:219)
- waiting to lock <0x000000077d335760> (a org.apache.kafka.streams.KafkaStreams)
at org.apache.kafka.streams.KafkaStreams.access$100(KafkaStreams.java:117)
at org.apache.kafka.streams.KafkaStreams$StreamStateListener.onChange(KafkaStreams.java:259)
- locked <0x000000077d42f138> (a org.apache.kafka.streams.KafkaStreams$StreamStateListener)
at org.apache.kafka.streams.processor.internals.StreamThread.setState(StreamThread.java:168)
- locked <0x000000077d33c538> (a org.apache.kafka.streams.processor.internals.StreamThread)
at org.apache.kafka.streams.processor.internals.StreamThread.setStateWhenNotInPendingShutdown(StreamThread.java:176)
- locked <0x000000077d33c538> (a org.apache.kafka.streams.processor.internals.StreamThread)
at org.apache.kafka.streams.processor.internals.StreamThread.access$1600(StreamThread.java:70)
at org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsRevoked(StreamThread.java:1321)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare(ConsumerCoordinator.java:406)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:349)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:296)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1037)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1002)
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:531)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:669)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:326)
```
In a nutshell: `KafkaStreams` and `StreamThread` are both
waiting for each other since another intermittent `close`
(e.g. from a test) comes along also trying to lock on
`KafkaStreams`:
```sh
"main" #1 prio=5 os_prio=0 tid=0x00007f66d000c800 nid=0x78bb in Object.wait() [0x00007f66d7a15000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1249)
- locked <0x000000077d45a590> (a java.lang.Thread)
at org.apache.kafka.streams.KafkaStreams.close(KafkaStreams.java:503)
- locked <0x000000077d335760> (a org.apache.kafka.streams.KafkaStreams)
at org.apache.kafka.streams.KafkaStreams.close(KafkaStreams.java:447)
at org.apache.kafka.streams.KafkaStreamsTest.testCannotStartOnceClosed(KafkaStreamsTest.java:115)
```
=> causing a deadlock.
Fixed this by softer locking on the state change, which guarantees
atomic changes to the state but does not lock on the whole object
(I at least could not find another method that would require more
than atomically locked access except for `setState`).
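The idea of the softer locking can be sketched like this. It is a simplified illustration, not the patched `KafkaStreams` code: `SketchStateHolder`, its `State` values, and the dedicated `stateLock` object are assumptions for this example.

```java
// Hypothetical sketch: state changes use their own lock, not the object monitor.
final class SketchStateHolder {
    enum State { CREATED, REBALANCING, PENDING_SHUTDOWN, NOT_RUNNING }

    private final Object stateLock = new Object();
    private State state = State.CREATED;

    /** Atomic state change that never needs the monitor of 'this'. */
    void setState(State newState) {
        synchronized (stateLock) {
            state = newState;
        }
    }

    State state() {
        synchronized (stateLock) {
            return state;
        }
    }

    /**
     * Holds the object monitor while joining the worker. Because setState()
     * only takes stateLock, the worker can still report state changes and
     * terminate, so the join does not deadlock.
     */
    synchronized void close(Thread worker) {
        setState(State.PENDING_SHUTDOWN);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        setState(State.NOT_RUNNING);
    }
}
```

If `setState` were itself `synchronized` on the holder, a worker calling it during `close` would block on the monitor that `close` holds while joining that same worker, which is the shape of the deadlock in the dumps above.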
Also qualified the state listeners with their outer-class to make
the whole code-flow around this more readable (having two
interfaces with the same naming for interface and method and then
using them between their two outer classes is crazy hard to read
imo :)).
Easy to reproduce yourself by running
`org.apache.kafka.streams.KafkaStreamsTest` in a loop for a bit
(save yourself some time by running 2-4 in parallel :)). Eventually
it will lock on one of the tests (for me this takes less than 1 min
with 4 parallel runs).
Author: Armin Braun <me@obrown.io>
Author: Armin <me@obrown.io>
Reviewers: Eno Thereska <eno@confluent.io>, Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2791 from original-brownbear/fix-streams-deadlock
Fix for adding state stores with regex defined sources
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Matthias J. Sax, Damian Guy, Guozhang Wang
Closes #2618 from bbejeck/KAFKA-4791_unable_to_add_statestore_regex_topics
We got the test error `org.apache.kafka.common.errors.TopicExistsException: Topic 'inputTopic' already exists.` in some builds. It can be reproduced reliably on a local machine. The root cause is the asynchronous topic deletion, which might not have finished before the topic gets re-created.
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma, Damian Guy, Guozhang Wang
Closes #2757 from mjsax/minor-fix-resetintegrationtest
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jun Rao <junrao@gmail.com>, Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2691 from cmccabe/KAFKA-4902