... by
a) merging some for startup/shutdown efficiency.
b) using independent state dirs.
c) removing some tests that are covered elsewhere.
guozhangwang ewencp - tests are running much quicker now, i.e., down to about 1 minute on my laptop (from about 2-3 minutes). There were some issues with state dirs in some of the integration tests that were causing the shutdown of the streams apps to take a long time.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ismael Juma, Guozhang Wang
Closes#1525 from dguy/integration-tests
And not the containing struct's default value.
The contribution is my original work and I license the work to the project under the project's open source license.
ewencp
Author: Rollulus <roelboel@xs4all.nl>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1528 from rollulus/kafka-3864
We had mentioned this step in the performance impact section in the middle of a long paragraph, which made it easy to miss. I also tweaked the reason for setting `log.message.format.version` as it could be misinterpreted previously.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1514 from ijuma/tweak-upgrade-notes
From the kafka-dev mailing list discussion: [[DISCUSS] scalability limits in the coordinator](http://mail-archives.apache.org/mod_mbox/kafka-dev/201605.mbox/%3CCAMQuQBZDdtAdhcgL6h4SmTgO83UQt4s72gc03B3VFghnME3FTAmail.gmail.com%3E)
There's a scalability limit on the new consumer / coordinator regarding the amount of group metadata we can fit into one message. This restricts a combination of consumer group size, topic subscription sizes, topic assignment sizes, and any remaining member metadata.
Under more strenuous use cases like mirroring clusters with thousands of topics, this limitation can be reached even after applying gzip to the __consumer_offsets topic.
Various options were proposed in the discussion:
1. Config change: reduce the number of consumers in the group. This isn't always a realistic answer in more strenuous use cases like MirrorMaker clusters or for auditing.
2. Config change: split the group into smaller groups which together get full coverage of the topics. This gives each group member a smaller subscription (e.g. g1 has topics starting with a-m while g2 has topics starting with n-z). This would be operationally painful to manage.
3. Config change: split the topics among members of the group. Again this gives each group member a smaller subscription. This would also be operationally painful to manage.
4. Config change: bump up KafkaConfig.messageMaxBytes (a topic-level config) and KafkaConfig.replicaFetchMaxBytes (a broker-level config). Applying messageMaxBytes to just the __consumer_offsets topic seems relatively harmless, but bumping up the broker-level replicaFetchMaxBytes would probably need more attention.
5. Config change: try different compression codecs. Based on 2 minutes of googling, it seems like lz4 and snappy are faster than gzip but have worse compression, so this probably won't help.
6. Implementation change: support sending the regex over the wire instead of the fully expanded topic subscriptions. I think people said in the past that different languages have subtle differences in regex, so this doesn't play nicely with cross-language groups.
7. Implementation change: maybe we can reverse the mapping? Instead of mapping from member to subscriptions, we can map a subscription to a list of members.
8. Implementation change: maybe we can try to break apart the subscription and assignments from the same SyncGroupRequest into multiple records? They can still go to the same message set and get appended together. This way the limit becomes the segment size, which shouldn't be a problem. This can be tricky to get right because we're currently keying these messages on the group, so I think records from the same rebalance might accidentally compact one another, but my understanding of compaction isn't that great.
9. Implementation change: try to apply some tricks on the assignment serialization to make it smaller.
10. Config and Implementation change: bump up the __consumer_offsets topic messageMaxBytes and (from Jun Rao) fix how we deal with the case when a message is larger than the fetch size. Today, if the fetch size is smaller than the message size, the consumer will get stuck. Instead, we can simply return the full message if it's larger than the fetch size without requiring the consumer to manually adjust the fetch size.
11. Config and Implementation change: same as above, but only apply the special fetch logic when fetching from internal topics.
This PR provides an implementation of option 11.
That being said, I'm not very happy with this approach as it essentially doesn't honor the "replica.fetch.max.bytes" config. Better alternatives are definitely welcome!
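For reference, a minimal sketch of the config bumps from options 4/10, assuming the standard `max.message.bytes` (topic-level) and `replica.fetch.max.bytes` (broker-level) names; the values are hypothetical:

```scala
import java.util.Properties

// Hypothetical values: raise the per-topic limit for __consumer_offsets and
// the broker-level replica fetch size so larger metadata messages replicate.
val offsetsTopicProps = new Properties()
offsetsTopicProps.put("max.message.bytes", "2097152") // topic-level override

val brokerProps = new Properties()
brokerProps.put("replica.fetch.max.bytes", "2097152") // broker-level setting
```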
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jiangjie Qin <becket.qin@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1484 from onurkaraman/KAFKA-3810
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1500 from vahidhashemian/typo07/fix_javadoc_typos_consumerrebalancelistener
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax
Closes#1520 from guozhangwang/KHotfix-iter-hasNext-window-value-getter
- Check if the DB is null before flushing or closing. In some cases, a state store is closed twice. This happens in `StreamTask.close()` where both `node.close()` and `super.close` (in `ProcessorManager`) are called in sequence. If the user's processor defines a `close` that closes the underlying state store, then the second close is redundant.
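A minimal sketch of the guard, using a hypothetical wrapper rather than the actual store code:

```scala
// Once the underlying DB handle is cleared, flush/close become no-ops.
class GuardedStore(private var db: AutoCloseable) {
  def flush(): Unit = if (db != null) { /* flush pending writes */ }
  def close(): Unit = {
    if (db != null) { // a second close() is now harmless
      db.close()
      db = null
    }
  }
}
```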
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Andrés Gómez, Ismael Juma, Guozhang Wang
Closes#1485 from enothereska/KAFKA-3805-locks
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1513 from hachikuji/followup-for-kafka-2720
guozhangwang enothereska mjsax miguno
If you get a chance, can you please take a look at this? I've done the repartitioning in the join, but it results in 2 internal topics for each join. This seems like overkill, as sometimes we wouldn't need to repartition at all, other times just 1 topic, and sometimes both, but I'm not sure how we can know that.
I'd also need to implement something similar for leftJoin, but again, I'd like to see if I'm heading down the right path or if anyone has any other bright ideas.
For reference - https://github.com/apache/kafka/pull/1453 - the previous PR
Thanks for taking the time and looking forward to getting some welcome advice :-)
Author: Damian Guy <damian.guy@gmail.com>
Author: Damian Guy <damian@continuum.local>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1472 from dguy/KAFKA-3561
Follow-up on KAFKA-724 (#1469) to allow OS auto-tuning of socket buffer sizes for both the broker and the clients.
Author: Sebastien Launay <sebastien@opendns.com>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1507 from slaunay/enhancement/os-socket-buffer-size-tuning-for-clients
This PR is the follow-on to the closed PR #1410.
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1477 from bbejeck/KAFKA-3443_streams_support_for_regex_sources
See https://issues.apache.org/jira/browse/KAFKA-3753
This contribution is my original work and I license the work to the project under the project's open source license.
cc guozhangwang kichristensen ijuma
Author: Jeff Klukas <jeff@klukas.net>
Reviewers: Ismael Juma, Guozhang Wang
Closes#1486 from jklukas/kvstore-size
guozhangwang mjsax enothereska
Currently, Kafka Streams does not have a util to get access to the sequence number added to the key of window state store changelogs. I'm interested in exposing it so that the contents of a changelog topic can be 1) inspected for debugging purposes and 2) saved to a text file and loaded back from it.
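A minimal sketch of reading that sequence number, assuming a `[serialized key][8-byte timestamp][4-byte seqnum]` key layout (the layout is an assumption here, for illustration only):

```scala
import java.nio.ByteBuffer

// Read the trailing 4 bytes of the changelog key as the sequence number.
def sequenceNumber(binaryKey: Array[Byte]): Int =
  ByteBuffer.wrap(binaryKey).getInt(binaryKey.length - 4)
```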
Author: Roger Hoover <roger.hoover@gmail.com>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1501 from theduderog/expose-seq-num
Only log the client and server principals, which is what ZooKeeper does after ZOOKEEPER-2405.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke <granthenke@gmail.com>, Sriharsha Chintalapani <harsha@hortonworks.com>
Closes#1498 from ijuma/kafka-3830-get-tgt-debug-confidential
Adding an error logging message in Log.loadSegments() for the case where an index file corresponding to a log file exists but an exception is thrown.
Signed-off-by: Ishita Mandhan <imandha@us.ibm.com>
Author: Ishita Mandhan <imandha@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1480 from imandhan/KAFKA-3762
The `partition` argument is not marked as required, and has a default of `0`, according to the tool's help message. However, if `partition` is not provided the command fails with `Missing required argument "[partition]"`. This patch fixes the tool's required arguments by removing `partition` from them.
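A minimal sketch of the intended option definition (hypothetical parser setup; `withRequiredArg` refers to the option's value, not the option itself):

```scala
import joptsimple.OptionParser

val parser = new OptionParser
val partitionOpt = parser.accepts("partition", "The partition to consume from.")
  .withRequiredArg()
  .ofType(classOf[java.lang.Integer])
  .defaultsTo(Integer.valueOf(0))

val options = parser.parse() // no --partition given
println(options.valueOf(partitionOpt)) // prints 0, so it must not be "required"
```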
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1495 from vahidhashemian/minor/simple_consumer_shell_update_required_args
Hi all,
This is my first commit to Kafka.
"msec / 1000" turns into sec, isn't it?
I just have fixed a variable name.
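A hypothetical before/after of the rename:

```scala
val startMs = System.currentTimeMillis()
// ... do some work ...
val elapsedMs = System.currentTimeMillis() - startMs
val elapsedSecs = elapsedMs / 1000 // seconds, so the name should say "secs"
```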
granders
Author: kota-uchida <kota-uchida@cybozu.co.jp>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1482 from uchan-nos/elapsed-ms
See https://issues.apache.org/jira/browse/KAFKA-3711
I've tested locally that this change does indeed resolve the warning I mention in the ticket:
```
org.apache.kafka.clients.consumer.ConsumerConfig: The configuration metric.dropwizard.registry = kafka-metrics was supplied but isn't a known config.
```
where `metric.dropwizard.registry` is a configuration value defined in a custom `MetricReporter` (https://github.com/SimpleFinance/kafka-dropwizard-reporter).
With this change, the above warning no longer appears, as ewencp predicted.
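A minimal sketch of such a pluggable class, hypothetical and reduced to the `Configurable` interface for brevity (a real reporter would implement `MetricsReporter`):

```scala
import org.apache.kafka.common.Configurable

// Reads its own key from the client configs passed to configure(); after this
// change, the key is no longer reported as an unknown config.
class DropwizardStyleReporter extends Configurable {
  private var registryName: String = _
  override def configure(configs: java.util.Map[String, _]): Unit =
    registryName = String.valueOf(configs.get("metric.dropwizard.registry"))
}
```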
This contribution is my original work and I license the work to the project under the project's open source license.
Author: Jeff Klukas <jeff@klukas.net>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1479 from jklukas/abstractconfig-originals
ijuma harshach edoardocomar Can you please review the changes?
edoardocomar I have addressed your comment about the extra space.
Author: Bharat Viswanadham <bharatv@us.ibm.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1474 from bharatviswa504/Kafka-3748
- replace `System.exit(1)` with a regular `return` in order to release the latch blocking the shutdown hook thread from shutting down the JVM
- provide `PrintStream` to the `process` method in order to ease unit testing
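A hypothetical sketch of both changes together:

```scala
import java.io.PrintStream

// The output stream is injected so tests can capture it, and errors `return`
// instead of calling System.exit(1), letting the shutdown hook complete.
def process(maxMessages: Int, output: PrintStream): Unit = {
  if (maxMessages < 0) {
    output.println(s"Invalid maxMessages: $maxMessages")
    return // was System.exit(1), which blocked the shutdown hook's latch
  }
  // ... consume up to maxMessages and write each record to `output` ...
}
```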
Author: Sebastien Launay <sebastien@opendns.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1185 from slaunay/bugfix/KAFKA-3501-console-consumer-hangs
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke <granthenke@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1476 from ijuma/kafka-3781-exception-name-npe
- ZkClient is used for conditional path deletion and wraps `KeeperException.BadVersionException` into `ZkBadVersionException`
- add unit test to `SimpleAclAuthorizerTest` to reproduce the issue and catch potential future regression
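A minimal sketch of the pattern, with a hypothetical helper:

```scala
import org.I0Itec.zkclient.ZkClient
import org.I0Itec.zkclient.exception.ZkBadVersionException

// ZkClient rethrows KeeperException.BadVersionException as
// ZkBadVersionException, so that is what the caller must catch.
def conditionalDelete(zk: ZkClient, path: String, expectedVersion: Int): Boolean =
  try {
    zk.delete(path, expectedVersion)
    true
  } catch {
    case _: ZkBadVersionException => false // node changed since it was read
  }
```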
Author: Sebastien Launay <sebastien@opendns.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1461 from slaunay/bugfix/KAFKA-3783-zk-conditional-delete-path
If socket.receive.buffer.bytes/socket.send.buffer.bytes are set to -1, use the OS defaults.
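A minimal sketch of the rule, with a hypothetical helper:

```scala
import java.net.Socket

// Only override the kernel's buffer sizing when the configured value is not
// -1; -1 means "keep the OS default / auto-tuning".
def applyBufferSizes(socket: Socket, sendBytes: Int, receiveBytes: Int): Unit = {
  if (sendBytes != -1) socket.setSendBufferSize(sendBytes)
  if (receiveBytes != -1) socket.setReceiveBufferSize(receiveBytes)
}
```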
Author: Joshi <rekhajoshm@gmail.com>
Author: Rekha Joshi <rekhajoshm@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1469 from rekhajoshm/KAFKA-724-rebased
If no messages are sent to a topic during the last refresh interval, or if an UNKNOWN_TOPIC_OR_PARTITION error is received, remove the topic from the metadata list. Topics are added to the list again on the next attempt to send a message to them.
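A hypothetical sketch of that expiry rule:

```scala
import scala.collection.mutable

// Topics not sent to within the refresh interval are dropped; the next send
// to a topic simply re-adds it.
class MetadataTopics(expiryMs: Long) {
  private val lastAccessMs = mutable.Map[String, Long]()
  def add(topic: String, nowMs: Long): Unit = lastAccessMs(topic) = nowMs
  def expireStale(nowMs: Long): Unit =
    lastAccessMs.retain((_, lastMs) => nowMs - lastMs < expiryMs)
}
```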
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Author: rsivaram <rsivaram@uk.ibm.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#645 from rajinisivaram/KAFKA-2948
- Used flatMap instead of map and flatten
- Used isEmpty, nonEmpty, and isDefined as appropriate
- Used head, keys and keySet where appropriate
- Used contains, diff and find where appropriate
- Removed redundant val modifier for case class constructor
- Used toString without parentheses consistently, since it has no parameters and no side effects
- Removed unnecessary returns, parentheses, and semicolons.
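Illustrative before/after pairs for a few of the idioms above (hypothetical values; behavior is unchanged in each pair):

```scala
val nested = List(List(1, 2), List(3))
val flatOld = nested.map(identity).flatten // before
val flatNew = nested.flatMap(identity)     // after

val m = Map("a" -> 1)
val hasOld = m.keySet.exists(_ == "a") // before
val hasNew = m.contains("a")           // after

val emptyOld = m.size == 0 // before
val emptyNew = m.isEmpty   // after
```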
Author: Joshi <rekhajoshm@gmail.com>
Author: Rekha Joshi <rekhajoshm@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1451 from rekhajoshm/KAFKA-3771
Included a couple of clean-ups: removed an unused variable, and the instantiated `KafkaProducer` is now closed.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Sriharsha Chintalapani <harsha@hortonworks.com>
Closes#1470 from ijuma/move-console-producer-test-to-unit-folder
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#1471 from ijuma/fix-leaking-producers-in-plaintext-producer-send-test
The timestamp of messages consumed by mirror maker is not preserved after sending to the target cluster. The correct behavior is to keep the create timestamp the same in both source and target clusters.
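A hypothetical sketch of the fix, building the outgoing record with the consumed record's create timestamp instead of letting the producer assign a new one:

```scala
import org.apache.kafka.clients.producer.ProducerRecord

def mirrorRecord(topic: String, key: Array[Byte], value: Array[Byte],
                 sourceTimestamp: Long): ProducerRecord[Array[Byte], Array[Byte]] =
  // null partition: let the partitioner decide; timestamp comes from source
  new ProducerRecord[Array[Byte], Array[Byte]](
    topic, null, java.lang.Long.valueOf(sourceTimestamp), key, value)
```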
Author: Tao Xiao <xiaotao183@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1466 from xiaotao183/KAFKA-3787
The host and port entries under /brokers/ids/<bid> get filled in only for the PLAINTEXT security protocol. For other protocols the host is null and the actual endpoint is under "endpoints". This causes an NPE when running the consumer group and offset checker scripts in a kerberized environment. By always reading the host and port values from the "endpoints" field, a more meaningful exception is thrown rather than an NPE.
Author: Arun Mahadevan <aiyer@hortonworks.com>
Reviewers: Sriharsha Chintalapani <schintalapani@hortonworks.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1301 from arunmahadevan/cg_kerb_fix
Set OffsetsTopicReplicationFactorProp to 3, like MinInSyncReplicasProp. Otherwise a consumer was able to consume via assign but not via subscribe, so testProduceAndConsume is now duplicated to check both paths.
Author: Edoardo Comar <ecomar@uk.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1425 from edoardocomar/KAFKA-3728
- Fixed the logic calculating the windows that are affected by a new event in the case of hopping windows and a small overlap.
- Added a unit test that tests for the issue
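A hypothetical sketch of the corrected calculation: an event at `timestamp` falls only into hopping windows `[start, start + sizeMs)` whose start lies in `[timestamp - sizeMs + 1, timestamp]` and is a multiple of `advanceMs`:

```scala
def windowStartsFor(timestamp: Long, sizeMs: Long, advanceMs: Long): Seq[Long] = {
  val earliest = math.max(0L, timestamp - sizeMs + 1) // oldest possible start
  val firstAligned = ((earliest + advanceMs - 1) / advanceMs) * advanceMs
  firstAligned to timestamp by advanceMs // aligned starts containing the event
}
```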
Author: Tom Rybak <trybak@gmail.com>
Reviewers: Michael G. Noll, Matthias J. Sax, Guozhang Wang
Closes#1462 from trybak/bugfix/KAFKA-3784-TimeWindows#windowsFor-false-positives
Currently javadoc doesn't specify a charset.
This pull request sets it to UTF-8.
Author: Sasaki Toru <sasakitoa@nttdata.co.jp>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1413 from sasakitoa/javadoc_garbled
I found both issues while investigating the issue described in PR #1425.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Sriharsha Chintalapani <schintalapani@hortonworks.com>, Jun Rao <junrao@gmail.com>
Closes#1455 from ijuma/fix-integration-test-harness-and-zk-test-harness