As title states. We plan to merge this to both trunk and 2.3 if it could fix the stream system tests globally.
Reference implementation: #6673
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>
When we close a task and EOS is enabled we should always close the producer regardless if the task is in a zombie state (the broker fenced the producer) or not.
I've added tests that fail without this change.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Jason Gustafson <jason@confluent.io>
The implementation of KIP-258 broke the state store methods in KStreamTestDriver.
These methods were unused in this project, so the breakage was not detected.
Since this is an internal testing utility, and it was deprecated and partially removed in
favor of TopologyTestDriver, I opted to just complete the removal of the class.
Reviewers: A. Sophie Blee-Goldman <ableegoldman@gmail.com>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Basic idea:
KTable-KTable join: set max(left-ts,right-ts) for result
#agg(...) (stream/table windowed/non-windowed): set max(ts1, ts2, ts3,...) of all input records that contribute to the aggregation result
for all stateless transformation: input-ts -> output-ts
Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <john@confluent.io>, Andy Coates <andy@confluent.io>, Bill Bejeck <bbejeck@gmail.com
When users provide a name for operation via the Streams DSL, we need to increment the counter used for auto-generated names to make sure any operators downstream of a named operator still produce a compatible name.
This PR is a subset of #6411 by @fhussonnois. We need to merge this PR now because it covers cases when users name repartition topics or state stores.
Updated tests to reflect the counter produces expected number even when the user provides a name.
Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>
This upgrade exposes a number of new options, including the WriteBufferManager which -- along with existing TableConfig options -- allows users to limit the total memory used by RocksDB across instances. This can alleviate some cascading OOM potential when, for example, a large number of stateful tasks are suddenly migrated to the same host.
The RocksDB docs guarantee backwards format compatibility across versions
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>,
When choosing the next record to process, we should look at the head record's timestamp of each partition and choose the lowest rather than choosing the lowest of the partition's streamtime.
This change effectively makes RecordQueue return the timestamp of the head record rather than its streamtime. Streamtime is removed (replaced) from RecordQueue as it was only being tracked in order to choose the next partition to poll from.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
For session-windows, the result record should have the window-end timestamp as record timestamp.
Rebased to resolve merge conflicts. Removed unused classes TupleForwarder and ForwardingCacheFlushListener (replace with TimestampedTupleForwarder, SessionTupleForwarder, TimestampedCacheFlushListerner, and SessionCacheFlushListener)
Reviewers: John Roesler <john@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Consolidated the unit tests by having {RocksDB/InMemory}{Window/Session}StoreTest extend {Window/Session}BytesStoreTest. Besides some implementation-specific tests (eg involving segment maintenance) all tests were moved to the abstract XXXBytesStoreTest class. The test coverage now is a superset of the original test coverage for each store type.
The only difference made to existing tests (besides moving them) was to switch from list-based equality comparison to set based, in order to reflect that the stores make no guarantees regarding the ordering of records returned from a range fetch.
There are some implementation-specific tests that were left in the corresponding test class. The RocksDBWindowStoreTest, for example, had several tests pertaining to segments and/or the underlying filesystem. Another key difference is that the in-memory versions should delete expired records aggressively, while the RocksDB versions should only remove entirely expired segments.
Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Following KIP-453, this PR adds a default close() method to the RocksDBConfigSetter interface and calls it when closing a store.
Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>, Bruno Cadonna <bruno@confluent.io>
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <john@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Bill Bejeck <bill@confluent.io>
* Fixed bug in Struct.equals where we returned prematurely and added tests
* Update RequestResponseTest to check that `equals` and `hashCode` of
the struct is the same after serialization/deserialization only when possible.
* Use `Objects.equals` and `Long.hashCode` to simplify code
* Removed deprecated usages of `JUnitTestSuite`
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Adds recording of close of a stand-by task to the task-closed metric
Adds unit tests to verify the recording
Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <john@confluent.io>
Part of KIP-345 effort. The strategy is to extract user passed in group.instance.id config and pass it in with given thread-id (because consumer is currently per-thread level).
Reviewers: Guozhang Wang <wangguoz@gmail.com>
While working on consolidating the various store unit tests I uncovered some minor "bugs" in the in-memory stores (inconsistencies with the behavior as established by the RocksDB stores).
open iterators should be properly closed in the case the store is closed
fetch/findSessions should always throw NPE if key is null
window end time should be truncated at Long.MAX_VALUE rather than throw exception
(Verified in-memory stores pass all applicable rocksDB tests now, unified unit tests coming in another PR)
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
Any RocksJava object that inherits from org.rocksdb.AbstractNativeReference must be closed explicitly in order to free up the memory of the backing C++ object. The BloomFilter extends RocksObject (which implements AbstractNativeReference) and should be also be closed in RocksDBStore#close to avoid leaking memory.
Reviewers: Bill Bejeck <bbejeck@gmail.com>
Fix two problems in Streams:
* Session windows expired prematurely (off-by-one error), since the window end is inclusive, unlike other windows
* Suppress duration for sessions incorrectly waited only the grace period, but session windows aren't closed until gracePeriod + sessionGap
Update the tests accordingly
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Details in the JIRA: https://issues.apache.org/jira/browse/KAFKA-8285
Basically we want to avoid sharing of atomic updates for thread id with multiple stream instances on one JVM.
Reviewers: Raoul de Haard, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
When processing multiple key-changing operations during the optimization phase a ConcurrentModificationException is possible.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
* Revert "MINOR: Add unit test for SerDe auto-configuration (#6610)"
This reverts commit 172fbb2dd55144e2e44777174f970b56768e1777.
* Revert "[KAFKA-3729] Auto-configure non-default SerDes passed alongside the topology builder (#6461)"
This reverts commit e56ebbffca57741d398283e46073ed4170f8f927.
The two merged PRs introduce a breaking change. Reverting to preserve backward compatibility. Jira ticket reopened.
Reviewers: Ted Yu <yuzhihong@gmail.com>, Guozhang Wang <guozhang@confluent.io>
First pass at an in-memory session store implementation.
Reviewers: Simon Geisler, Damian Guy <damian@confluent.io>, John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
While going through the review of InMemorySessionStore I realized there is also some minor cleanup to be done for the other in-memory stores. This includes trivial changes such as removing unnecessary references to 'this' and moving collection initialization to the declaration. It also fixes some unsafe behavior (registering an iterator from inside its own constructor). In-memory window store iterator classes were made static and some instances of KeyValueIterator missing types were fixed across a handful of tests.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bruno Cadonna <bruno@confluent.io>
This is the first diff for the implementation of JoinGroup logic for static membership. The goal of this diff contains:
* Add group.instance.id to be unique identifier for consumer instances, provided by end user;
Modify group coordinator to accept JoinGroupRequest with/without static membership, refactor the logic for readability and code reusability.
* Add client side support for incorporating static membership changes, including new config for group.instance.id, apply stream thread client id by default, and new join group exception handling.
* Increase max session timeout to 30 min for more user flexibility if they are inclined to tolerate partial unavailability than burdening rebalance.
* Unit tests for each module changes, especially on the group coordinator logic. Crossing the possibilities like:
6.1 Dynamic/Static member
6.2 Known/Unknown member id
6.3 Group stable/unstable
6.4 Leader/Follower
The rest of the 345 change will be broken down to 4 separate diffs:
* Avoid kicking out members through rebalance.timeout, only do the kick out through session timeout.
* Changes around LeaveGroup logic, including version bumping, broker logic, client logic, etc.
* Admin client changes to add ability to batch remove static members
* Deprecate group.initial.rebalance.delay
Reviewers: Liquan Pei <liquanpei@gmail.com>, Stanislav Kozlovski <familyguyuser192@windowslive.com>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Streams previously flushed stores in the order of their registration, which is arbitrary. Because stores may forward values upon flush (as in cached state stores), we must flush stores in topological order.
Reviewers: Bill Bejeck <bbejeck@gmail.com>, Guozhang Wang <wangguoz@gmail.com>