This contribution is my original work and I license the work to the Kafka project under the project's open source license.
cc guozhangwang miguno ymatsuda
Author: Jeff Klukas <jeff@klukas.net>
Reviewers: Michael G. Noll <michael@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #1270 from jklukas/streams-doc-fix
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ismael Juma, Josh Gruenberg, Michael G. Noll, Ewen Cheslack-Postava
Closes #1229 from guozhangwang/K3499
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Grant Henke <granthenke@gmail.com>, Ashish Singh <asingh@cloudera.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1254 from hachikuji/KAFKA-3602
with help from enothereska :)
Author: Ben Stopford <benstopford@gmail.com>
Reviewers: Jun Rao <junrao@apache.org>, Eno Thereska <eno.thereska@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #956 from benstopford/KAFKA-3270-ReassignPartitionsCommand-Tests
Kafka Streams seems to hold file handles on the `.lock` files for the state dirs, resulting in an explosion of file handles over time. Running `lsof` shows the number of open handles on the `.lock` files increasing rapidly. In a separate test project, I reproduced the issue and determined that the file handle is relinquished only once the `FileChannel` instance is properly closed. Applying this patch resolves the issue in my job.
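As an illustration of the pattern the fix implies (a hypothetical sketch, not the actual Kafka Streams locking code): releasing the `FileLock` alone does not free the OS-level handle; the underlying `FileChannel` must also be closed.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not the actual Kafka Streams code.
final class StateDirLock implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    StateDirLock(Path lockFile) throws IOException {
        channel = FileChannel.open(lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        lock = channel.tryLock(); // null if another process already holds the lock
    }

    @Override
    public void close() throws IOException {
        if (lock != null)
            lock.release();  // releases the lock itself...
        channel.close();     // ...but only closing the channel frees the file handle
    }
}
```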
Author: Greg Fodor <gfodor@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1267 from gfodor/bug/state-lock-filehandle-leak
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Michael G. Noll <michael@confluent.io>
Closes #1255 from dguy/kgroupedtable-count-test
ewencp Can you take a quick look?
Author: Liquan Pei <liquanpei@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1252 from Ishiihara/pre-list-connectors
This exception occurs when the producer tries to append a record to a re-enqueued record batch in the accumulator. We should not allow a record to be added to a re-enqueued record batch. This is due to a bug in the MemoryRecords.hasRoomFor() method: after MemoryRecords.close() has been called, hasRoomFor() should return false.
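Roughly, the intended invariant looks like this (a simplified sketch; the real `MemoryRecords` API takes key/value byte arrays and tracks buffer capacity differently):

```java
// Simplified sketch of the intended invariant, not the actual MemoryRecords code.
class RecordBatchBuffer {
    private boolean writable = true;
    private int remainingBytes = 16384;

    boolean hasRoomFor(int recordSize) {
        // Once closed (e.g. when the batch is re-enqueued for retry),
        // the batch must reject new records even if space remains.
        return writable && recordSize <= remainingBytes;
    }

    void close() {
        writable = false;
    }
}
```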
Author: Manikumar reddy O <manikumar.reddy@gmail.com>
Reviewers: Ismael Juma, Grant Henke, Guozhang Wang
Closes #1249 from omkreddy/KAFKA-3594
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1246 from guozhangwang/K3589
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1231 from mjsax/kafka-3337-extact-key-selector-from-agg
For enums and other constant strings, use locale-independent case conversions so that comparisons work regardless of the default locale.
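A quick standalone illustration of the pitfall (not code from the patch): in the Turkish locale, the default case conversion turns `i` into the dotted capital `İ`, which breaks comparisons against ASCII constants.

```java
import java.util.Locale;

public class LocaleSafeCaseConversion {
    public static void main(String[] args) {
        Locale.setDefault(new Locale("tr", "TR"));
        // Locale-sensitive: yields "GZİP" under tr_TR, so equals("GZIP") fails.
        System.out.println("gzip".toUpperCase());
        // Locale-independent: always yields "GZIP".
        System.out.println("gzip".toUpperCase(Locale.ROOT));
    }
}
```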
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Manikumar Reddy, Ismael Juma, Guozhang Wang, Gwen Shapira
Closes #1220 from rajinisivaram/KAFKA-3548
RollingBounceTest is a system test that cannot be run reliably as a unit test, and ReplicationTest is a superset of its
functionality: in addition to verifying that bouncing leaders eventually results in a new leader, ReplicationTest also
validates that data continues to be produced and consumed.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Gwen Shapira
Closes #1242 from ewencp/minor-remove-rolling-bounce-integration-test
This patch fixes all occurrences of two consecutive 'the's in the code comments.
Author: Ishita Mandhan <imandha@us.ibm.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1240 from imandhan/typofixes
ewencp gwenshap Docs. I also tried to clean up some typos. However, it seems that even though the source does not contain two words without a space between them, some words show up with no space in between in the generated doc.
Author: Liquan Pei <liquanpei@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1227 from Ishiihara/config-doc
* Use a fixed `Random` seed in `EndToEndLatency.scala` for determinism
* Add `compression_type` to and remove `consumer_fetch_max_wait` from `end_to_end_latency.py`. The latter was never used.
* Tweak logging of `end_to_end_latency.py` to be similar to `consumer_performance.py`.
* Add `compression_type` to `benchmark_test.py` methods and add `snappy` to `matrix` annotation
* Use randomly generated bytes from a restricted range for `ProducerPerformance` payload. This is a simple fix for now. It can be improved in the PR for KAFKA-3554.
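As a hedged sketch of the seeding and payload points above (illustrative only, not the actual `EndToEndLatency` or `ProducerPerformance` code):

```java
import java.util.Random;

public class DeterministicPayload {
    public static void main(String[] args) {
        // A fixed seed makes benchmark runs reproducible and comparable.
        Random random = new Random(0);
        byte[] payload = new byte[100];
        for (int i = 0; i < payload.length; i++) {
            // Restricting bytes to a small range ('a'..'z') yields data that
            // compresses realistically instead of behaving like pure noise.
            payload[i] = (byte) ('a' + random.nextInt(26));
        }
        System.out.println(new String(payload));
    }
}
```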
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1225 from ijuma/kafka-3558-add-compression_type-benchmark_test.py
… With KStream, the method selectKey was added to enable deriving a key from values before performing aggregation-by-key operations on original streams that have null keys.
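A minimal sketch of the pattern (written against a newer Streams DSL shape, so method names differ from the 0.10.0-era API; the topic name is made up):

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class SelectKeyExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        // Records on this topic arrive with null keys.
        KStream<byte[], String> lines = builder.stream("input-topic");
        lines
            .selectKey((key, value) -> value.split(" ")[0]) // derive a key from the value
            .groupByKey()   // aggregation-by-key now has non-null keys to group on
            .count();
    }
}
```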
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1222 from bbejeck/KAFKA-3430_allow_users_to_set_key_KTable_toStream
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1217 from granthenke/close-consumers
* The hope is that RocksDB 4.4.1 is more stable than 4.1.0 (occasional segfaults) and 4.2.0 (very frequent segfaults); release notes for 4.4.1: https://www.facebook.com/groups/rocksdb.dev/permalink/925995520832296/
* slf4j 1.7.21 includes thread-safety fixes: http://www.slf4j.org/news.html
* snappy 1.1.2.4 includes performance improvements requested by Spark, which apply to our usage: https://github.com/xerial/snappy-java/blob/master/Milestone.md
I ran the stream tests several times and they passed every time while 4.2.0 segfaulted every time.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes #1219 from ijuma/kafka-3557-update-rocks-db-4.4.1-snappy-slf4j
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes #1206 from hachikuji/KAFKA-3470
Author: Ismael Juma <ismael@juma.me.uk>
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Geoff Anderson <geoff@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1173 from ijuma/kafka-3490-multiple-version-support-perf-tests
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes #1203 from enothereska/KAFKA-3504-logcompaction
This PR fixes 8 typos in HTML files of the `docs` module. I list them explicitly here since GitHub sometimes does not highlight corrections on long lines correctly.
- docs/api.html: compatability => compatibility
- docs/connect.html: simultaneoulsy => simultaneously
- docs/implementation.html: LATIEST_TIME => LATEST_TIME, nPartions => nPartitions
- docs/migration.html: Decomission => Decommission
- docs/ops.html: stoping => stopping, ConumserGroupCommand => ConsumerGroupCommand, youre => you're
Author: Dongjoon Hyun <dongjoon@apache.org>
Reviewers: Ismael Juma
Closes #1138 from dongjoon-hyun/KAFKA-3461
All dependencies on Hadoop were removed with MiniKDC. This removes the leftover version entry.
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma
Closes #1214 from granthenke/remove-hadoop
While playing with the client, I got the following exception:
```java
java.lang.IllegalArgumentException: Invalid partition given with record: 1 is not in the range [0...1].
```
It's obviously incorrect, so I've fixed it.
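The confusing part is that the message states an inclusive range but printed the partition count itself. A hedged sketch of the corrected bounds check (variable names are hypothetical, not the actual producer code):

```java
// Illustrative bounds check with hypothetical names, not the actual producer code.
public class PartitionCheck {
    public static void main(String[] args) {
        int numPartitions = 1; // the topic has a single partition
        int partition = 1;     // user-supplied partition, one past the end
        if (partition < 0 || partition >= numPartitions) {
            // Print the last valid partition (numPartitions - 1), not the count.
            throw new IllegalArgumentException("Invalid partition given with record: " + partition
                    + " is not in the range [0..." + (numPartitions - 1) + "].");
        }
    }
}
```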
Author: Igor Stepanov <igor.stepanov@keystonett.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1210 from stepio/trunk
Addresses comments from the previous PR [#1187]:
- Changed print and writeAsText method return signatures to void
- Flush System.out on close
- Changed IllegalStateException to TopologyBuilderException
- Updated MockProcessorContext.topic method to return a String
- Renamed KStreamPrinter to KeyValuePrinter
- Updated the printing of null keys to 'null' to match ConsoleConsumer
- Updated JavaDoc to state the need to override toString
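For context, a sketch of how the DSL printing described in the list above reads in use (against the 0.10-era API; the topic name and file path are made up):

```java
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class PrintExample {
    public static void main(String[] args) {
        KStreamBuilder builder = new KStreamBuilder();
        KStream<String, String> stream = builder.stream("input-topic");
        // Both calls now return void, so they terminate the chain.
        stream.print();                        // null keys are printed as 'null'
        stream.writeAsText("/tmp/stream.txt"); // same output, written to a file
    }
}
```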
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Dan Norwood, Guozhang Wang
Closes #1209 from bbejeck/KAFKA-3338_Adding_print/writeAsText_to_Streams_DSL
This should make Log.read act the same when startOffset is larger than maxOffset as it would if startOffset was larger than logEndOffset. The current behavior can result in an IllegalArgumentException from LogSegment if a consumer attempts to fetch an offset above the high watermark which is present in the leader's log. It seems more correct if Log.read presents the view of the log to consumers as if it simply ended at maxOffset (high watermark).
I've tried to describe an example scenario of this happening here https://issues.apache.org/jira/browse/KAFKA-725?focusedCommentId=15221673&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15221673
I'm not sure I understand why ReplicaManager sets maxOffset to the high watermark, and not high watermark + 1. Isn't the high watermark the last committed message, and readable by consumers?
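A hedged sketch of the behavior change described above (the actual code is Scala inside `Log.read`; the `FetchResult` type here is hypothetical):

```java
// Hypothetical sketch; not the actual kafka.log.Log (Scala) code.
public class LogReadSketch {
    record FetchResult(long start, long end, boolean empty) {
        static FetchResult empty() { return new FetchResult(-1, -1, true); }
    }

    static FetchResult read(long startOffset, long maxOffset, long logEndOffset) {
        // Present the log to consumers as if it ended at maxOffset (the high
        // watermark), mirroring how reads at logEndOffset return empty.
        long readableEnd = Math.min(maxOffset, logEndOffset);
        if (startOffset >= readableEnd)
            return FetchResult.empty(); // instead of IllegalArgumentException from LogSegment
        return new FetchResult(startOffset, readableEnd, false);
    }

    public static void main(String[] args) {
        System.out.println(read(5, 5, 10)); // empty: startOffset == high watermark
    }
}
```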
Tests passed for me locally on the second try; it seems the first run just hit a flaky test.
Author: Stig Rohde Døssing <sdo@it-minds.dk>
Reviewers: Jiangjie Qin <becket.qin@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes #1178 from srdo/KAFKA-725
miguno guozhangwang please have a look if you can.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Michael G. Noll <michael@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #1193 from enothereska/kafka-3512-ForEach
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Anna Povzner <anna@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1190 from guozhangwang/K3505
This PR, https://github.com/apache/kafka/pull/958, fixed the use of prop_file when we have multiple producers (before, every producer would append to the config). However, it assumes that self.prop_file is initially "". This is correct for all existing tests, but it precludes us from extending the verifiable producer and adding more properties to the producer config (as the console consumer does).
This is a small PR to change the behavior to the original, but also make verifiable producer use prop_file method to be consistent with console consumer.
Also, a few more fixes to the verifiable producer came up during the review:
- fixed the each_produced_at_least() method
- more straightforward use of compression types
granders please review.
Author: Anna Povzner <anna@confluent.io>
Reviewers: Geoff Anderson <geoff@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1192 from apovzner/fix_verifiable_producer
Fail unsent requests only when returning from KafkaConsumer.poll().
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1183 from rajinisivaram/KAFKA-3488
Allows the maximum retries when writing to ZooKeeper to be overridden in tests and sets the value to Int.MaxValue to avoid transient failures.
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1156 from granthenke/transient-acl-test