We need to close producer first before closing tasks to make sure all messages are acked and hence checkpoint offsets are updated before closing tasks and their state. It was re-ordered mistakenly before.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Yasuhiro Matsuda <yasuhiro@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#845 from guozhangwang/KStreamState
Changes include:
1) Move logging logic from MeteredXXXStore to internal stores, and leave WindowedStore API clean by removed all internalPut/Get functions.
2) Wrap common logging behavior of InMemory and LRUCache stores into one class.
3) Fix a bug for StoreChangeLogger where byte arrays are not comparable in HashSet by using a specified RawStoreChangeLogger.
4) Add a caching layer on top of RocksDBStore with object caching, it relies on the object's equals and hashCode function to be consistent with serdes.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Yasuhiro Matsuda <yasuhiro@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#826 from guozhangwang/K3060
guozhangwang
removing an unused class, FilteredIterator, and its test.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Gwen Shapira
Closes#816 from ymatsuda/remove_obsolete_class
It behaves better on Windows and provides more useful error messages.
Also:
* Minor inconsistency fix in `kafka.server.OffsetCheckpoint`.
* Remove delete from `streams.state.OffsetCheckpoint` constructor (similar to the change in `kafka.server.OffsetCheckpoint` in 836cb19633 (diff-2503b32f29cbbd61ed8316f127829455L29)).
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#771 from ijuma/kafka-3105-use-atomic-move-with-fallback-instead-of-rename
guozhangwang
When ```WindowedSerializer``` is specified in ```to(...)``` or ```through(...)``` for a key, we use ```WindowedStreamPartitioner```.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#779 from ymatsuda/partitioner
Added option to use custom partitioning logic within each topology sink.
Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Guozhang Wang
Closes#309 from rhauch/kafka-2649
guozhangwang
An implementation of local store for join window. This implementation uses "rolling" of RocksDB instances for timestamp based truncation.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#726 from ymatsuda/windowed_join
guozhangwang
At DAG level, `KTable<K,V>` sends (key, (new value, old value)) to down stream. This is done by wrapping the new value and the old value in an instance of `Change<V>` class and sending it as a "value" part of the stream. The old value is omitted (set to null) by default for optimization. When any downstream processor needs to use the old value, the framework should enable it (see `KTableImpl.enableSendingOldValues()` and implementations of `KTableProcessorSupplier.enableSensingOldValues()`).
NOTE: This is meant to be used by aggregation. But, if there is a use case like a SQL database trigger, we can add a new KTable method to expose this.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#672 from ymatsuda/trigger
guozhangwang
* a test for ktable state store creation
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#661 from ymatsuda/more_ktable_test
onurkaraman becketqin Do you have time to review this patch? It addresses the ticket that jjkoshy filed in KAFKA-2668.
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Onur Karaman <okaraman@linkedin.com>, Joel Koshy <jjkoshy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#328 from lindong28/KAFKA-2668
guozhangwang
* fix ProcessorStateManager to use correct ktable partitions
* more ktable tests
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#635 from ymatsuda/more_ktable_test
guozhangwang
* added KTable API and impl
* added standby support for KTable
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#604 from ymatsuda/add_ktable
Changes made for using getBaseConsumerConfigs from StreamingConfig.getConsumerConfigs.
Author: bbejeck <bbejeck@gmail.com>
Author: Bill Bejeck <bbejeck@gmail.com>
Reviewers: Guozhang Wang
Closes#596 from bbejeck/KAFKA-2902-StreamingConfig-use-getBaseConsumerConfigs
Starting a KafkaStream was getting an error due to the fact that the TopologyBuilder.addSink method was not connecting the sink with it parent(s) processor/sources. Just needed to wire up the sink with it parent(s) in TopologyBuilder.addSink .
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Guozhang Wang
Closes#572 from bbejeck/KAFKA-2872_kafka_stream_sink_not_connected_to_parent
guozhangwang
A restore consumer does not belong to a consumer group.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#543 from ymatsuda/no_group_for_restore_consumer
guozhangwang
An optimization which may reduce unnecessary poll for standby tasks.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#535 from ymatsuda/remove_empty_standby_task
guozhangwang
* added a new config param "num.standby.replicas" (the default value is 0).
* added a new abstract class AbstractTask
* added StandbyTask as a subclass of AbstractTask
* modified StreamTask to a subclass of AbstractTask
* StreamThread
* standby tasks are created by calling StreamThread.addStandbyTask() from onPartitionsAssigned()
* standby tasks are destroyed by calling StreamThread.removeStandbyTasks() from onPartitionRevoked()
* In addStandbyTasks(), change log partitions are assigned to restoreConsumer.
* In removeStandByTasks(), change log partitions are removed from restoreConsumer.
* StreamThread polls change log records using restoreConsumer in the runLoop with timeout=0.
* If records are returned, StreamThread calls StandbyTask.update and pass records to each standby tasks.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#526 from ymatsuda/standby_task
Fails when order of elements is incorrect
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Yasuhiro Matsuda
Closes#510 from granthenke/streams-test
guozhangwang
When the rebalance happens each consumer reports the following information to the coordinator.
* Client UUID (a unique id assigned to an instance of KafkaStreaming)
* Task ids of previously running tasks
* Task ids of valid local states on the client's state directory
TaskAssignor does the following
* Assign a task to a client which was running it previously. If there is no such client, assign a task to a client which has its valid local state.
* Try to balance the load among stream threads.
* A client may have more than one stream threads. The assignor tries to assign tasks to a client proportionally to the number of threads.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#497 from ymatsuda/task_assignment