Add a backwardFetch call to the window store for sliding window
processing. While the implementation works with the forward call
to the window store, using backwardFetch allows the iterator
to be closed earlier, making the implementation more efficient.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
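A conceptual sketch in Python (not the actual Kafka Streams code) of why the backward fetch lets the iterator close earlier: when sliding-window processing needs the latest record in a time range, a backward scan can stop at the first in-range record, while a forward scan must walk the whole range first. Function names and the timestamp list are illustrative assumptions.

```python
# `records` is assumed sorted ascending by timestamp.
def forward_latest(records, lo, hi):
    latest, touched = None, 0
    for ts in records:
        touched += 1
        if ts > hi:
            break              # must scan past the range before knowing the answer
        if ts >= lo:
            latest = ts
    return latest, touched

def backward_latest(records, lo, hi):
    touched = 0
    for ts in reversed(records):
        touched += 1
        if lo <= ts <= hi:
            return ts, touched  # first hit from the back is the latest; close here
    return None, touched
```

With 100 records and the range [40, 60], the backward scan touches noticeably fewer records before it can close.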
Add debug logs to see when Streams calls poll, process, commit, etc.
Reviewers: Walker Carlson <wcarlson@confluent.io>, Guozhang Wang <guozhang@apache.org>
Convert Topology#addProcessor and #addGlobalStore
Also, convert some of the internals in support of addProcessor
Reviewers: Bill Bejeck <bbejeck@apache.org>
Record the pollSensor after every invocation of poll, rather than only when we get records back, so that we can accurately gauge how often we're invoking Consumer#poll.
Reviewers: Bruno Cadonna <bruno@confluent.io>, Guozhang Wang <guozhang@apache.org>, Matthias J. Sax <mjsax@apache.org>
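A toy sketch (illustrative names, not the actual Streams metrics code) of why the sensor must be recorded on every poll invocation: recording only on non-empty results under-counts how often Consumer#poll is called.

```python
class Sensor:
    def __init__(self):
        self.count = 0

    def record(self):
        self.count += 1

def poll_loop_old(poll_results, sensor):
    for records in poll_results:
        if records:            # old behavior: only count polls that returned data
            sensor.record()

def poll_loop_new(poll_results, sensor):
    for records in poll_results:
        sensor.record()        # fixed: count every invocation of poll
```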
Before this commit, Kafka Streams would gracefully shut down the whole application when a source topic was deleted. The graceful shutdown does not give the user the possibility to react to the deletion of the source topic in the uncaught exception handler.
This commit changes this behavior and throws an error when a source topic is deleted.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <guozhang@apache.org>, John Roesler <vvcephei@apache.org>
This commit adds the remaining property-based RocksDB metrics as described in KIP-607, except for num-entries-active-mem-table, which was added in PR #9177.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Implements KIP-617 for WindowStore; depends on #9137.
Testing strategy: extend existing tests to validate reverse operations are supported.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Fixes a problem in which the Serdes class in the same package as
the tests (the old one) overshadows the one we explicitly imported
(the new one), but only in Scala 2.12. Since users (hopefully) don't
put their classes in our packages, they won't face the same problem.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>, John Roesler <vvcephei@apache.org>
A wildcard import of the old org.apache.kafka.streams.scala.Serdes leads
to a name clash because some of the implicits have the same names as types
from the Scala standard library. The new org.apache.kafka.streams.scala.serialization.Serdes is
the same as the old Serdes, but without the name clashes.
The old one is marked as deprecated.
Also, add missing serdes for UUID, ByteBuffer and Short types in
the new Serdes.
Implements: KIP-616
Reviewers: John Roesler <vvcephei@apache.org>
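The clash can be transposed to Python for illustration (a sketch only; the real issue involves Scala implicits shadowing Scala standard-library type names, and all module and attribute names below are made up): a wildcard-style import from the old module silently shadows a name already in scope, while the clash-free names in the new module do not collide.

```python
import types

old_serdes = types.ModuleType("old_serdes")
old_serdes.String = "stringSerde"        # clashes with a common type name

new_serdes = types.ModuleType("new_serdes")
new_serdes.stringSerde = "stringSerde"   # clash-free name

String = str                             # the user's existing binding

# Emulates a wildcard import of the old module: shadows `String`.
scope = {"String": String}
scope.update({k: v for k, v in vars(old_serdes).items() if not k.startswith("__")})
shadowed = scope["String"]               # now the serde, not str

# The new module's names do not collide, so nothing is shadowed.
scope2 = {"String": String}
scope2.update({k: v for k, v in vars(new_serdes).items() if not k.startswith("__")})
unshadowed = scope2["String"]            # still str
```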
Add SlidingWindows API, implementation, and tests.
An edge case and an optimization are left to follow-on work.
Implements: KIP-450
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <mjsax@apache.org>, John Roesler <vvcephei@apache.org>
* Add the first RocksDB metric that exposes a RocksDB property: num-entries-active-mem-table.
* Add code to StreamsMetricsImpl in support of exposing RocksDB properties
* Add unit and integration tests
This commit only contains one metric to keep the PR at a reasonable size.
All other RocksDB metrics described in KIP-607 will be added in other PRs.
Implements: KIP-607
Reviewers: Guozhang Wang <guozhang@apache.org>, John Roesler <vvcephei@apache.org>
Adds avg, min, and max e2e latency metrics at the new TRACE level. Also adds the missing avg task-level metric at the INFO level.
I think where we left off with the KIP, the TRACE-level metrics were still defined to be "stateful-processor-level". I realized this doesn't really make sense, and would be pretty much impossible to define given the DFS processing approach of Streams, so I felt that store-level metrics made more sense to begin with. I haven't updated the KIP yet so I can get some initial feedback on this.
Reviewers: Bruno Cadonna <bruno@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Add new methods to KeyValueStore interfaces to support reverse iteration.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
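A minimal in-memory sketch of the idea (method names and the class below are illustrative assumptions, not the Kafka Streams KeyValueStore API): a store that supports both forward and reverse range iteration over its keys.

```python
from bisect import bisect_left, bisect_right

class InMemoryKeyValueStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def range(self, frm, to):
        # forward iteration over keys in [frm, to], ascending
        keys = sorted(self._data)
        for k in keys[bisect_left(keys, frm):bisect_right(keys, to)]:
            yield k, self._data[k]

    def reverse_range(self, frm, to):
        # reverse iteration: same bounds, descending key order
        yield from reversed(list(self.range(frm, to)))
```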
From KIP-478, implement the new StreamBuilder#addGlobalStore() overload
that takes a stateUpdateSupplier of a fully typed Processor<KIn, VIn, Void, Void>.
Where necessary, use the adapters to make the old APIs defer to the new ones,
which also limits the scope of this change set.
Reviewers: Boyang Chen <boyang@apache.org>
The main goal is to remove usage of the embedded broker (EmbeddedKafkaCluster) in AbstractJoinIntegrationTest and its subclasses.
This is because the tests under this class no longer use the embedded broker, except for two.
testShouldAutoShutdownOnIncompleteMetadata is one such test.
Furthermore, this test does not actually perform a stream-table join; it tests an edge case of joining with a non-existent topic, so it should be in a separate test.
Testing strategy: run existing unit and integration tests
Reviewers: Boyang Chen <boyang@confluent.io>, Bill Bejeck <bbejeck@apache.org>
Refactor the RocksDB store and the metrics infrastructure in Streams
in preparation for recording the RocksDB properties specified in KIP-607.
The refactoring includes:
* wrapper around BlockBasedTableConfig to make the cache accessible to the
RocksDB metrics recorder
* the RocksDB metrics recorder now also takes the DB instance and the cache in addition
to the statistics
* The value providers for the metrics are added to the RocksDB metrics recorder even if
the recording level is INFO.
* The creation of the RocksDB metrics recording trigger is moved to StreamsMetricsImpl
Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <vvcephei@apache.org>
In order to do this, I also removed the optimization such that once enforceCheckpoint is set to true, we always checkpoint unless the state stores are not initialized at all (i.e., the snapshot is null).
Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <ableegoldman@gmail.com>
In Kafka Streams the source of truth for a state store is its changelog; therefore, when committing a state store we only need to make sure its changelog records are all flushed and committed. We do not actually need to make sure that the materialized state is flushed and persisted, since it can always be restored from the changelog when necessary.
On the other hand, flushing a state store too frequently can have side effects; e.g., a RocksDB flush writes the memtable into an L0 SSTable, leaving many small L0 files to be compacted later, which introduces larger overhead.
Therefore this PR decouples flushing from committing, such that we do not always flush the state store upon committing, but only when sufficient data has been written since the last flush. The checkpoint file is then also overwritten only along with flushing the state store, indicating its current known snapshot. This is okay since: a) if EOS is not enabled, then it is fine if the local persisted state is actually ahead of the checkpoint; b) if EOS is enabled, then we would never write a checkpoint file until close.
Here's a more detailed change list of this PR:
1. Do not always flush state stores when calling pre-commit; move stateMgr.flush into post-commit to couple it with checkpointing.
2. In post-commit, we checkpoint when: a) The state store's snapshot has progressed much further compared to the previous checkpoint, b) When the task is being closed, in which case we enforce checkpointing.
3. There are some tricky obstacles that I had to work around in a slightly hacky way: for the cache / suppression buffer, we still need to flush them in pre-commit to make sure all records are sent via the producer, while the underlying state store should not be flushed. I've decided to introduce a new API in CachingStateStore to be triggered in pre-commit.
I've also made some minor changes piggy-backed in this PR:
4. Do not delete checkpoint file upon loading it, and as a result simplify the checkpointNeeded logic, initializing the snapshotLastFlush to the loaded offsets.
5. In closing, also follow the commit -> suspend -> close ordering as in revocation / assignment.
6. If enforceCheckpoint == true during RUNNING, still call maybeCheckpoint even with EOS, since that is the case for suspending / closing.
Reviewers: John Roesler <john@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>
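The commit/flush decoupling described above can be sketched as follows (a simplified Python model with made-up names and a made-up byte threshold, not the actual TaskManager/ProcessorStateManager code): commit no longer implies a store flush; the store is flushed and checkpointed only when enough data has accumulated since the last flush, or when checkpointing is enforced at close.

```python
class StateStoreTask:
    FLUSH_THRESHOLD_BYTES = 1024   # illustrative threshold

    def __init__(self):
        self.bytes_since_flush = 0
        self.flushes = 0
        self.checkpoints = 0

    def write(self, nbytes):
        self.bytes_since_flush += nbytes

    def post_commit(self, enforce_checkpoint=False):
        # Flush + checkpoint only when enough data was written since the
        # last flush, or when the task is closing (enforced checkpoint).
        if enforce_checkpoint or self.bytes_since_flush >= self.FLUSH_THRESHOLD_BYTES:
            self.flushes += 1
            self.checkpoints += 1     # checkpoint is written along with the flush
            self.bytes_since_flush = 0

    def close(self):
        self.post_commit(enforce_checkpoint=True)
```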
Update `CogroupedStreamAggregateBuilder` to have individual builders depending
on the windowed aggregation, or lack thereof. This replaces passing all options
into the builder, with all but the current type of aggregation set to null, and then
checking to see which value was not null.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
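The refactoring idea can be sketched like this (illustrative Python, not the actual builder; the tuple return values stand in for built topology nodes): replace one builder that takes every option, most of them null, and probes which one was set, with one small builder per aggregation type.

```python
# Before: null-juggling in a single entry point.
def build_aggregate(initializer, time_windows=None, session_windows=None,
                    sliding_windows=None):
    if time_windows is not None:
        return ("time-windowed", initializer)
    if session_windows is not None:
        return ("session-windowed", initializer)
    if sliding_windows is not None:
        return ("sliding-windowed", initializer)
    return ("plain", initializer)

# After: one builder per case; callers pick the right one directly
# and no parameter is ever null.
def build_time_windowed(initializer, windows):
    return ("time-windowed", initializer)

def build_plain(initializer):
    return ("plain", initializer)
```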
Adds the new Processor and ProcessorContext interfaces
as proposed in KIP-478. To integrate in a staged fashion
with the code base, adapters are included to convert back
and forth between the new and old APIs.
ProcessorNode is converted to the new APIs.
Reviewers: Boyang Chen <boyang@confluent.io>
- part of KIP-572
- replace `retries` in InternalTopicManager with infinite retries plus a new timeout, based on the consumer config MAX_POLL_INTERVAL_MS
Reviewers: David Jacot <djacot@confluent.io>, Boyang Chen <boyang@confluent.io>
- implements KIP-648
- Deprecated the existing getters and added new getters without the `get` prefix to `KeyQueryMetadata`
Co-authored-by: johnthotekat <Iabon1989*>
Reviewers: Navinder Pal Singh Brar <navinder_brar@yahoo.com>, Matthias J. Sax <matthias@confluent.io>
- part of KIP-572
- removed the usage of `retries` in `GlobalStateManager`
- instead of retries the new `task.timeout.ms` config is used
Reviewers: John Roesler <john@confluent.io>, Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
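The retry-count-to-timeout change can be sketched as follows (an illustrative Python helper; the function name, backoff, and error type are assumptions, not the actual GlobalStateManager code): retry indefinitely on transient failures, bounded only by a deadline derived from `task.timeout.ms`.

```python
import time

def retry_until_deadline(op, task_timeout_s, backoff_s=0.01, clock=time.monotonic):
    deadline = clock() + task_timeout_s
    while True:
        try:
            return op()
        except Exception:
            # No retry counter: keep retrying until the task timeout elapses.
            if clock() >= deadline:
                raise TimeoutError("task.timeout.ms exceeded")
            time.sleep(backoff_s)
```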
- replace System.exit with Exit.exit in all relevant classes
- forbid use of System.exit in all relevant classes and add exceptions for others
Co-authored-by: John Roesler <vvcephei@apache.org>
Co-authored-by: Matthias J. Sax <matthias@confluent.io>
Reviewers: Lucas Bradstreet <lucas@confluent.io>, Ismael Juma <ismael@confluent.io>
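The Exit indirection pattern can be sketched like this (a simplified Python model loosely based on Kafka's Exit utility; the method names are assumptions): production code calls Exit.exit instead of sys.exit / System.exit, so tests can swap in a harmless handler instead of killing the JVM or test runner.

```python
import sys

class Exit:
    _handler = sys.exit   # default: really exit the process

    @classmethod
    def exit(cls, code):
        cls._handler(code)

    @classmethod
    def set_exit_handler(cls, handler):
        # Tests install a recording handler instead of the real exit.
        cls._handler = handler
```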
Set session timeout and heartbeat interval to default for RestoreIntegrationTest
Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
In PR #8962 we introduced a sentinel UNKNOWN_OFFSET to mark unknown offsets in checkpoint files. The sentinel was set to -2, the same value used for the sentinel LATEST_OFFSET, which is used in subscriptions to signal that state stores have been used by an active task. Unfortunately, we failed to skip UNKNOWN_OFFSET when computing the sum of the changelog offsets.
If a task had only one state store and did not restore anything before the next rebalance, the stream thread wrote -2 (i.e., UNKNOWN_OFFSET) into the subscription as the sum of the changelog offsets. During assignment, the leader interpreted the -2 as if the stream thread had run the task as active, although it might have run it as a standby. This misinterpretation of the sentinel value resulted in unexpected task assignments.
Ports: KAFKA-10287 / #9066
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, John Roesler <vvcephei@apache.org>, Matthias J. Sax <mjsax@apache.org>
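The bug and its fix can be sketched as follows (the constants mirror the sentinels described above; the summing function itself is illustrative, not the actual assignor code):

```python
UNKNOWN_OFFSET = -2   # checkpoint-file sentinel introduced in PR #8962
LATEST_OFFSET = -2    # subscription sentinel: "store was used by an active task"

def sum_changelog_offsets(offsets):
    # Skip UNKNOWN_OFFSET sentinels; otherwise a task with a single
    # unrestored store would report -2, which the leader would read as
    # LATEST_OFFSET and treat the task as previously active.
    valid = [o for o in offsets.values() if o != UNKNOWN_OFFSET]
    return sum(valid) if valid else UNKNOWN_OFFSET
```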
Reviewers: Mickael Maison <mickael.maison@gmail.com>, David Jacot <djacot@confluent.io>, Lee Dongjin <dongjin@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
The start() function for the global stream thread only checks whether the thread is not running, as it needs to block until it finishes initialization. This PR fixes this behavior by also checking whether the thread is already in the error state.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <vvcephei@apache.org>
AbstractProcessorContext#topic() throws a NullPointerException when modifying a state store within the DSL from a punctuator. Reorder the checks to avoid the NPE.
Co-authored-by: Ashish Roy <v-ashish.r@turvo.com>
Reviewers: Boyang Chen <boyang@confluent.io>
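A toy sketch of the reordering fix (illustrative Python with Python's AttributeError standing in for Java's NullPointerException; the classes and the None fallback are assumptions, not the actual AbstractProcessorContext code): inside a punctuator there is no current record, so the record context must be checked before it is dereferenced.

```python
class RecordContext:
    def __init__(self, topic):
        self.topic = topic

class ProcessorContext:
    def __init__(self, record_context=None):
        # record_context is None when called from a punctuator
        self.record_context = record_context

    def topic_old(self):
        # old order: dereference first -> AttributeError (Java's NPE)
        return self.record_context.topic

    def topic_new(self):
        # fixed order: check for the punctuator case before dereferencing
        if self.record_context is None:
            return None
        return self.record_context.topic
```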
- part of KIP-216
- adds new sub-classes of InvalidStateStoreException
Reviewers: Navinder Pal Singh Brar <navinder_brar@yahoo.com>, Matthias J. Sax <matthias@confluent.io>
- part of KIP-572
- deprecates producer config `retries` (still in use)
- deprecates admin config `retries` (still in use)
- deprecates Kafka Streams config `retries` (will be ignored)
- adds new Kafka Streams config `task.timeout.ms` (follow up PRs will leverage this new config)
Reviewers: John Roesler <john@confluent.io>, Jason Gustafson <jason@confluent.io>, Randall Hauch <randall@confluent.io>
Add docs for KIP-441 and KIP-613.
Fixed some miscellaneous unrelated issues in the docs:
* Adds some missing configs to the Streams config docs: max.task.idle.ms, topology.optimization, default.windowed.key.serde.inner.class, and default.windowed.value.serde.inner.class
* Defines the previously-undefined default windowed serde class configs, including choosing a default (null) and giving them a doc string, so they should now show up in the auto-generated general Kafka config docs
* Adds a note to warn users about the RocksDB bug that prevents setting a strict capacity limit and counting write buffer memory against the block cache
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
Add a log4j entry summarizing the assignment (previously owned and newly assigned) at the consumer level.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>