Fixes KAFKA-3857
Changes proposed in this pull request:
An additional log cleaner metric has been added:
time-since-last-run-ms: Time since the last log cleaner run, in milliseconds. This metric would be reset to 0 every time log cleaner thread runs. If this metric keeps constantly increasing, it indicates that the log cleaner thread is not alive.
If you are creating alerts around log cleaner, you could monitor this metric. A high "time-since-last-run-ms" value (eg: 600000) indicates that the log cleaner hasn't been running since the last 10 minutes.
The code has been tested. JMX metric has been verified.
Note: This pull request is a continuation of the following pull request. PR#1593 was quite old and I had some trouble rebasing it. Decided to start a fresh PR.
927b28cf41 (diff-ca1c127eee4b3c748ae73028f6abeab8)
Author: Kiran Pillarisetty <pillarisetty@tivo.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#2378 from kiranptivo/log_cleaner_jmx_metric
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes#2074 from vahidhashemian/KAFKA-3853
This PR is extracted from https://github.com/apache/kafka/pull/2333 as an incremental fix to ease the reviewing:
1. Removed `storeToProcessorNodeMap` from ProcessorTopology since it was previously used to set the context current record, and can now be replaced with the dirty entry in the named cache.
2. Replaced `sourceStoreToSourceTopic` from ProcessorTopology with `storeToChangelogTopic` map, which includes the corresponding changelog topic name for all stores that are changelog enabled.
3. Modified `ProcessorStateManager` to rely on `sourceStoreToSourceTopic` when retrieving the changelog topic; this makes the second parameter `loggingEnabled` in `register` not needed any more, and we can deprecate the old API with a new one.
4. Also fixed a minor issue in `KStreamBuilder`: if the storeName is not provided in the `table(..)` function, do not create the underlying materialized store. Modified the unit tests to cover this case.
5. Fixed a bunch of other unit tests failures that are exposed by this refactoring, in which we are not setting the applicationId correctly when constructing the mocking processor topology.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax, Ewen Cheslack-Postava
Closes#2338 from guozhangwang/KMinor-refactor-state-to-changelogtopic
This way, if the ${KAFKA_NUM_CONTAINERS} is changed in docker/run_tests.sh, the json is still valid
Author: Emanuele Cesena <emanuele.cesena@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#2370 from 0x0ece/patch-1
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira <cshapi@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes#2354 from ijuma/kafka-4565-separation-of-internal-and-external-traffic
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes#2366 from rajinisivaram/KAFKA-4626
Validate and fail client connection if multiple login modules are specified in sasl.jaas.config to avoid harder-to-debug authentication failures later on.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#2356 from rajinisivaram/KAFKA-4581
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2348 from hachikuji/minor-cleanup-for-kip-102
and remove some unnecessary SuppressWarnings annotations
Author: Xavier Léauté <xavier@confluent.io>
Reviewers: Ismael Juma, Guozhang Wang
Closes#2363 from xvrl/kip-100-followup
Besides API and runtime changes, this PR also includes 2 data transformations (`InsertField`, `HoistToStruct`) and 1 routing transformation (`TimestampRouter`).
There is some gnarliness in `ConnectorConfig` / `ConfigDef` around creating, parsing and validating a dynamic `ConfigDef`.
Author: Shikhar Bhushan <shikhar@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#2299 from shikhar/smt-2017
Add Global Tables to KafkaStreams. Global Tables are fully replicated once-per instance of KafkaStreams. A single thread is used to update them. They can be used to join with KStreams, KTables, and other GlobalKTables. When participating in a join a GlobalKTable is only ever used to perform a lookup, i.e., it will never cause data to be forwarded to downstream processor nodes.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Eno Thereska, Guozhang Wang
Closes#2244 from dguy/global-tables
KAFKA-4603 the argument of shell in doc wrong and command parsed error
In "7.6.2 Migrating clusters" of document security.html, the argument "--zookeeper.connection" of shell "zookeeper-security-migrat.sh" is wrong and the using of OptionParser is not correct
This patch corrected the doc and changed the OptionParser constructor
Author: auroraxlh <xin.lihua1@zte.com.cn>
Reviewers: Xi Hu <huxi_2b@hotmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#2322 from auroraxlh/my-issue
- TimeWindows represent half-open time intervals while SessionWindows represent closed time intervals
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#2342 from mjsax/kafka-3452-session-window-follow-up
mjsax
Here's my first pass at finer grained auto offset reset strategies.
I've left TODO comments about whether we want to consider adding this to `KGroupedTable.aggregate` and `KStreamImpl` when re-partitioning a source.
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#2007 from bbejeck/KAFKA-4114_allow_different_offset_reset_strategies
Changed caching in LoginManager to allow one LoginManager per client
JAAS configuration.
Added test to End2EndAuthorization for SASL Plain and GSSAPI with two
consumers with different credentials.
Developed with mimaison.
Author: Edoardo Comar <ecomar@uk.ibm.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2293 from edoardocomar/KAFKA-4180d
Kafka Streams: add granular metrics per node and per task, also expose ability to register non latency metrics in StreamsMetrics
Also added different recording levels to Metrics.
This is joint contribution from Eno Thereska and Aarti Gupta.
from https://github.com/apache/kafka/pull/1362#issuecomment-218326690-------
We can consider adding metrics for process / punctuate / commit rate at the granularity of each processor node in addition to the global rate mentioned above. This is very helpful in debugging.
We can consider adding rate / total cumulated metrics for context.forward indicating how many records were forwarded downstream from this processor node as well. This is helpful in debugging.
We can consider adding metrics for each stream partition timestamp.
This is helpful in debugging.
## Besides the latency metrics, we can also add throughput latency in terms of source records consumed.
More discussions here https://issues.apache.org/jira/browse/KAFKA-3715, KIP-104, KIP-105
Author: Eno Thereska <eno@confluent.io>
Author: Aarti Gupta <aartiguptaa@gmail.com>
Reviewers: Greg Fodor, Ismael Juma, Damian Guy, Guozhang Wang
Closes#1446 from aartigupta/trunk
The client should send older versions of requests to the broker if necessary.
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#2264 from cmccabe/KAFKA-4507
dguy guozhangwang This is a new PR for KAFKA-4060.
Author: Hojjat Jafarpour <hojjat@Hojjat-Jafarpours-MBP.local>
Author: Hojjat Jafarpour <hojjat@HojjatJpoursMBP.attlocal.net>
Reviewers: Damian Guy, Matthias J. Sax, Isamel Juma, Guozhang Wang
Closes#1884 from hjafarpour/KAFKA-4060-Remove-ZkClient-dependency-in-Kafka-Streams-new
Make appropriate methods contravariant in key and value types.
Author: Xavier Léauté <xavier@confluent.io>
Reviewers: Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#2205 from xvrl/streams-contravariance
Rename `SessionStore.findSessionsToMerge` to `findSessions`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#2339 from dguy/minor-findsession-rename
Closed the last PR I made for this because I accidentally borked it with my other PR. Small error; I figure this is from copy-pasting the above doc
Author: Nikki Thean <nthean@etsy.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#2336 from nixsticks/trunk
When creating the source node in TopologyBuilder, we need to decorate its input topics if they are inner (i.e. repartition) topics with the prefix.
Also did some minor cleanup in the printing function for better visualization in debugging.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Eno Thereska, Damian Guy, Eno Thereska, Jun Rao
Closes#2320 from guozhangwang/KMinor-source-topic-fix
Remove use of TestTimestampExtractor as it causes the logs to roll and segments get deleted.
Remove the wcnt example as it is dependent on the TestTimestampExtractor - windowed counting is covered elsewhere.
Change all aggregate operations to use TimeWindow as use of UnlimitedWindow was causing logs to roll and segments being deleted.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang, Eno Thereska
Closes#2319 from dguy/smoke-test
Author: yaojuncn <yaojuncn@users.noreply.github.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Konstantin <konstantin@tubemogul.com>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#2128 from yaojuncn/KAFKA-4402-client-producer-round-robin-fix
Add support for SessionWindows based on design detailed in https://cwiki.apache.org/confluence/display/KAFKA/KIP-94+Session+Windows.
This includes refactoring of the RocksDBWindowStore such that functionality common with the RocksDBSessionStore isn't duplicated.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#2166 from dguy/kafka-3452-session-merge
Adds a bunch of tests to unit tests to the assignment command.
Moves the Rack aware test into its own class as it makes use of ZooKeeperTestHarness and slows everything else down.
Author: Ben Stopford <benstopford@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#1950 from benstopford/os-rebalance-extra-unit-testing
Otherwise in this test the sink task goes through the pause/resume cycle with 0 assigned partitions, since the default metadata refresh interval is quite long
Author: Shikhar Bhushan <shikhar@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#2313 from shikhar/kafka-4575
Shut down the group coordinator before shutting down the log manager to
ensure that any delayed operations are completed before the logs are closed.
Author: steve <sniemitz@twitter.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#2311 from steveniemitz/KAFKA-4523