Adds flatTrasformValues methods in KStream
Adds processor supplier and processor for flatTransformValues
Improves API documentation of transformValues
Reviewers: Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Detailed description
* Adds KTable.suppress to the Scala API.
* Fixed count in KGroupedStream, SessionWindowedKStream, and TimeWindowedKStream so that the value serde gets passed down to the KTable returned by the internal mapValues method.
* Suppress API support for Java 1.8 + Scala 2.11
Testing strategy
I added unit tests covering:
* Windowed KTable.count.suppress w/ Suppressed.untilTimeLimit
* Windowed KTable.count.suppress w/ Suppressed.untilWindowCloses
* Non-windowed KTable.count.suppress w/ Suppressed.untilTimeLimit
* Session-windowed KTable.count.suppress w/ Suppressed.untilWindowCloses
Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
Due to KAFKA-8159, Streams will throw an unchecked exception when a caching layer or in-memory underlying store is queried over a range of keys from negative to positive. We should add a check for this and log it then return an empty iterator (as the RocksDB stores happen to do) rather than crash
Reviewers: Bruno Cadonna <bruno@confluent.io> Bill Bejeck <bbejeck@gmail.com>
Since we now call poll during restore, we can decrease the timeout
to a reasonable value, which should help Streams make progress if
threads get stuck.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
Since we've added Kafka Streams optimizations in 2.1 we need to move the optimization for source KTable nodes (use source topic as changelog) to the optimization framework.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
This PR is a follow-up of #6174, which handles doFilter / doMapValues / doTransformValues methods.
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
The goal of this task is to implement an integration test for the kafka stream metrics.
We have to check 2 things:
1. After streams application are started, all metrics from different levels (thread, task, processor, store, cache) are correctly created and displaying recorded values.
2. When streams application are shutdown, all metrics are correctly de-registered and removed.
Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Sub-task required to allow to define custom processor names with KStreams DSL(KIP-307) :
- add new public interface NamedOperation
- deprecate methods Joined.as() and Joined.name()
- update Suppredded interface to extend NamedOperation
Reviewers: Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Use concurrent data structure for the underlying cache in NamedCache, and iterate over it with subMap instead of many calls to get()
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
In a previous commit #6091, we've fixed a couple of edge cases and hence do not need to remove state listener anymore (before that we removed the state listener intentionally to avoid some race conditions, which has been gone for now).
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
TopicDescription and ConsumerGroupDescription in org.apache.kafka.clients.admin. are part of the public API, so we should retain the existing public constructor. Changed the new constructor with authorized operations to be package-private to avoid maintaining more public constructors since we only expect admin client to use this.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
The Java API can pass a Properties object to StreamsBuilder#build, to allow, e.g., topology optimization, while the Scala API does not yet. The latter only delegates the work to the underlying Java implementation.
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
This is a draft cleanup for KAFKA-7502. Here is the details:
* Make KTableKTableJoinNode abstract, and define its child classes ([NonMaterialized,Materialized]KTableKTableJoinNode) instead: now, all materialization-related routines are separated into the other classes.
* KTableKTableJoinNodeBuilder#build now instantiates [NonMaterialized,Materialized]KTableKTableJoinNode classes instead of KTableKTableJoinNode.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
Just a minor cleanup to use Java 8 lambdas vs anonymous classes in this test.
I ran all tests in the streams test suite
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
* add a normal windowed suppress with short windows and a short grace
period
* improve the smoke test so that it actually verifies the intended
conditions
See https://issues.apache.org/jira/browse/KAFKA-7944
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
If deprecated interface methods are inherited, the @Deprication tag should be used (instead on suppressing the deprecation warning).
Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Add TimestampedWindowStore builder/runtime classes
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
As of 2.0, Producer.initTransactions may throw a TimeoutException, which is retriable. Streams should retry instead of crashing when we encounter this exception
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
Metadata may be updated from the background thread, so we need to protect access to SubscriptionState. This patch restructures the metadata handling so that we only check pattern subscriptions in the foreground. Additionally, it improves the following:
1. SubscriptionState is now the source of truth for the topics that will be fetched. We had a lot of messy logic previously to try and keep the the topic set in Metadata consistent with the subscription, so this simplifies the logic.
2. The metadata needs for the producer and consumer are quite different, so it made sense to separate the custom logic into separate extensions of Metadata. For example, only the producer requires topic expiration.
3. We've always had an edge case in which a metadata change with an inflight request may cause us to effectively miss an expected update. This patch implements a separate version inside Metadata which is bumped when the needed topics changes.
4. This patch removes the MetadataListener, which was the cause of https://issues.apache.org/jira/browse/KAFKA-7764.
Reviewers: David Arthur <mumrah@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
Resolves the compiler warnings when building streams-scala.
Reviewers: A. Sophie Blee-Goldman <ableegoldman@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Improve JavaDocs about global state stores.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Sophie Blee-Goldman <sophie@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Rewrote the InMemoryWindowStore implementation by moving the work of a fetch to the iterator, and cleaned up the iterators as well.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
Moved hard-coded 'expired-window-record-drop' and 'late-record-drop' to static Strings in StreamsMetricsImpl
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
Third (and final) PR in series to inline the generic parameters of the following bytes stores:
[Pt. I] InMemoryKeyValueStore
[Pt. II] RocksDBWindowStore
[Pt. II] RocksDBSessionStore
[Pt. II] MemoryLRUCache
[Pt. II] MemoryNavigableLRUCache
[x] InMemoryWindowStore
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>