Add a note to `KafkaStreams.metadataForKey(String, K, Serializer<K>)` to point out that, in the case of a window store, the Serializer should still be the record-key serializer and not a window serializer.
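For example, given a running `KafkaStreams` instance `streams` (a minimal sketch; the store name and String key type are illustrative):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.state.StreamsMetadata;

// Even though "my-windowed-store" is backed by a window store, pass the plain
// record-key serializer, not a windowed-key serializer.
StreamsMetadata metadata = streams.metadataForKey(
        "my-windowed-store",            // illustrative store name
        "some-key",                     // the record key
        Serdes.String().serializer());  // record-key serializer
```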
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Xavier Léauté, Guozhang Wang
Closes #2532 from dguy/minor-docs
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes #2475 from vahidhashemian/minor/use_explicit_Errors_type_when_possible
1. Added an architecture section.
2. Added a configuration / execution sub-section to the developer guide.
Minor tweaks and a bunch of fixes that were missing from the `kafka-site` repo.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Derrick Or <derrickor@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2488 from guozhangwang/KMinor-streams-docs-second-pass
Author: Grant Henke <ghenke@cloudera.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2246 from granthenke/truststore-password
Generate the core project with the correct source folders. In addition,
set the output folders to match the command-line build, and don't
generate unnecessary projects.
Author: Dhwani Katagade <dhwani_katagade@persistent.com>
Reviewers: Edoardo Comar <ecomar@uk.ibm.com>, Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2382 from dhwanikatagade/gradle_eclipse_plugin_path_fix
This PR fixes a blocker issue where the streams client code cannot talk to the controller. It also enables a system test that was previously failing.
This PR is for trunk only. A separate PR with just the fix (but not the tests) will be created for 0.10.2.
Author: Eno Thereska <eno@confluent.io>
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Damian Guy, Ismael Juma, Matthias J. Sax, Guozhang Wang
Closes #2522 from enothereska/KAFKA-4716-metadata
This caused the bounce and smoke tests to fail on trunk.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #2524 from enothereska/hotfix-tests
GroupCoordinatorMetrics currently sets up join-time-max and sync-time-max incorrectly as a `new Avg()` MeasurableStat instead of `new Max()`.
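For reference, a minimal sketch of the corrected setup using the `org.apache.kafka.common.metrics` API (the sensor, group, and description strings here are illustrative):

```java
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Avg;
import org.apache.kafka.common.metrics.stats.Max;

Metrics metrics = new Metrics();
Sensor joinSensor = metrics.sensor("JoinSessions");
joinSensor.add(metrics.metricName("join-time-avg", "group-coordinator-metrics",
        "average time it took to complete a rejoin"), new Avg());
joinSensor.add(metrics.metricName("join-time-max", "group-coordinator-metrics",
        "max time it took to complete a rejoin"), new Max()); // was incorrectly new Avg()
```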
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #2520 from onurkaraman/KAFKA-4749
Author: Eno Thereska <eno.thereska@gmail.com>
Author: Eno Thereska <eno@confluent.io>
Author: Ubuntu <ubuntu@ip-172-31-22-146.us-west-2.compute.internal>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2478 from enothereska/minor-benchmark-args
Provide test coverage for exception paths in `schedule()`, `closeTopology()`, and `punctuate()`.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2451 from dguy/kafka-4640
Author: Satish Duggana <sduggana@hortonworks.com>
Reviewers: Dong Lin <lindong28@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2509 from satishd/buffer-cleanup
Added a general explanation of the tool and what it does. Also added a few details about the arguments.
Author: Gwen Shapira <cshapi@gmail.com>
Reviewers: Matthias J. Sax, Michael G. Noll, Guozhang Wang
Closes #2503 from gwenshap/KAFKA-4733
This change is in response to [KAFKA-4725](https://issues.apache.org/jira/browse/KAFKA-4725).
When a produce request is received, if the user/client is exceeding their produce quota, the response will be delayed until the quota is refilled appropriately.
Unfortunately, the request body is still referenced in the callback, which in turn leaks the messages contained within the request.
This change allows the `KafkaApis` method to take ownership of the request body from the `RequestChannel.Request` object.
I am not sure whether this breaks other invariants which are assumed within other parts of Kafka.
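To illustrate the underlying leak pattern in a standalone way (plain Java, not the actual broker code; names are illustrative): a delayed callback should capture only the small values it needs rather than the whole request body.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class ThrottleCaptureSketch {
    // Leaky: the closure captures `body`, pinning the whole payload in memory
    // for the entire throttle delay even though only its length is needed.
    static CompletableFuture<Void> leaky(byte[] body, long delayMs) {
        return CompletableFuture.runAsync(() -> respond(body.length),
                CompletableFuture.delayedExecutor(delayMs, TimeUnit.MILLISECONDS));
    }

    // Fixed: extract the needed value up front so `body` can be garbage
    // collected as soon as processing finishes.
    static CompletableFuture<Void> fixed(byte[] body, long delayMs) {
        final int size = body.length;
        return CompletableFuture.runAsync(() -> respond(size),
                CompletableFuture.delayedExecutor(delayMs, TimeUnit.MILLISECONDS));
    }

    static void respond(int size) {
        System.out.println("responding, size=" + size);
    }
}
```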
Author: Tim Carey-Smith <tim@spork.in>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes #2496 from halorgium/fix-throttled-response-leak
Author: Jiangjie Qin <becket.qin@gmail.com>
Reviewers: Dong Lin <lindong28@gmail.com>, Jun Rao <junrao@gmail.com>
Closes #2501 from becketqin/KAFKA-4734
1. Updated the value for `queued.max.requests` to 500.
2. Removed the invalid config `controller.message.queue.size`.
3. Removed the flush configs, including `log.flush.interval.messages`, `log.flush.interval.ms`, and `log.flush.scheduler.interval.ms`.
Author: huxi <huxi@zhenrongbao.com>
Reviewers: Grant Henke <granthenke@gmail.com>, Jun Rao <junrao@gmail.com>
Closes #2490 from amethystic/kafka4727_server_config_doc_update
In https://issues.apache.org/jira/browse/KAFKA-4521 we fixed a potential message-reordering bug in MirrorMaker. However, the patch introduced another bug that can cause a deadlock during MirrorMaker shutdown. The deadlock happens if the zookeeper listener thread calls `requestAndWaitForCommit()` after the MirrorMaker thread has already exited its loop of consuming and producing messages.
This patch fixes the problem by setting `iter` to `null` in `MirrorMakerOldConsumer.cleanup()`. If zookeeper listener thread calls `requestAndWaitForCommit()` after `cleanup()`, then it will not block waiting for commit notification since `iter == null`. If zookeeper listener thread calls `requestAndWaitForCommit()` before `cleanup()`, then `cleanup()` will call `notifyAll()` to unblock zookeeper listener thread.
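A minimal sketch of the synchronization pattern (class and member names are illustrative, not the actual MirrorMaker code):

```java
// Illustrative stand-in for the commit coordination described above.
class CommitGate {
    private Object iter = new Object(); // stands in for the consumer iterator

    synchronized void requestAndWaitForCommit() throws InterruptedException {
        if (iter == null)
            return;    // cleanup() already ran: return instead of blocking forever
        // ... request an offset commit here ...
        wait();        // woken by notifyAll(), e.g. from cleanup()
    }

    synchronized void cleanup() {
        iter = null;   // later callers of requestAndWaitForCommit() won't block
        notifyAll();   // unblock any thread that is already waiting
    }
}
```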
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Jiangjie Qin <becket.qin@gmail.com>
Closes #2504 from lindong28/KAFKA-4735
Removed the `readKeyValues()` variant that uses `UNLIMITED_MESSAGES`, which is doomed to exhaust the entire wait time, since all of its callers actually provide the expected number of messages.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Jason Gustafson <jason@confluent.io>
Closes #2507 from guozhangwang/KHotfix-not-use-limited-num-messages
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #2508 from ewencp/minor-streams-compatibility-trunk-dev-branch
The OfflinePartitionsCount and PreferredReplicaImbalanceCount metrics now check whether the topic is being deleted.
Added an integration test which polls the metrics while topics are being created and deleted.
Developed with mimaison.
Author: Edoardo Comar <ecomar@uk.ibm.com>
Reviewers: Dong Lin <lindong28@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes #2325 from edoardocomar/KAFKA-4441
This resolves an issue in driving tests using the ProcessorTopologyTestDriver when `groupBy()` is invoked downstream of a processor that flags repartitioning.
Ticket: https://issues.apache.org/jira/browse/KAFKA-4461
Discussion: http://search-hadoop.com/m/Kafka/uyzND1wbKeY1Q8nH1
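Roughly the shape of topology that used to break the driver (0.10.2-era Streams API; topic and store names are illustrative):

```java
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

KStreamBuilder builder = new KStreamBuilder();
KStream<String, String> input = builder.stream("input-topic");
input.map((key, value) -> KeyValue.pair(value, value)) // key change flags repartitioning
     .groupByKey()                                     // grouping downstream triggered the issue
     .count("counts-store");
// Driving this topology through the ProcessorTopologyTestDriver previously failed.
```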
dguy guozhangwang
The contribution is my original work and I license the work to the project under the project's open source license.
Author: Adrian McCague <amccague@gmail.com>
Reviewers: Damian Guy, Guozhang Wang
Closes #2499 from amccague/KAFKA-4461_ProcessorTopologyTestDriver_map_groupbykey
Delay the cleanup of state directories that are not locked and not owned by the current thread, such that we only remove a directory if its last-modified time is < now - cleanupDelayMs.
This also helps to avoid a race between threads on the same instance, where during rebalance, one thread releases the lock on the state directory, and before another thread can take the lock, the cleanup runs and removes the data.
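A minimal sketch of the rule (method and parameter names are illustrative; `Utils.delete` is Kafka's recursive-delete helper):

```java
import java.io.File;
import java.io.IOException;
import org.apache.kafka.common.utils.Utils;

// Only remove task directories that have been idle for longer than
// cleanupDelayMs, so a directory briefly unlocked during a rebalance survives.
void cleanRemovedTasks(File stateDir, long cleanupDelayMs) throws IOException {
    long now = System.currentTimeMillis();
    File[] taskDirs = stateDir.listFiles();
    if (taskDirs == null)
        return;
    for (File taskDir : taskDirs) {
        // the real code also checks that the directory is unlocked and not
        // owned by the current thread before deleting
        if (taskDir.lastModified() < now - cleanupDelayMs)
            Utils.delete(taskDir);
    }
}
```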
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2486 from dguy/KAFKA-4724
With this change, the consumer will be considered initialized in the
ProduceConsumeValidate tests once its partitions have been assigned.
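For example, readiness can be signaled from the rebalance callback (a minimal sketch, assuming `consumer` and `topics` are the test's consumer instance and subscription list):

```java
import java.util.Collection;
import java.util.concurrent.CountDownLatch;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

final CountDownLatch initialized = new CountDownLatch(1);
consumer.subscribe(topics, new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        initialized.countDown(); // only now does the consumer count as initialized
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
    }
});
```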
Author: Apurva Mehta <apurva.1618@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #2347 from apurvam/KAFKA-4588-fix-race-between-producer-consumer-start
This is a refactoring follow-up of https://github.com/apache/kafka/pull/2166. Main refactoring changes:
1. Extracted `InMemoryKeyValueStore` out of `InMemoryKeyValueStoreSupplier` and removed its duplicates in the test package.
2. Added two abstract classes, `AbstractKeyValueIterator` and `AbstractKeyValueStore`, to collapse common logic.
3. Added specialized `BytesXXStore` variants to accommodate cases where the key/value types are `Bytes` / `byte[]`, so that we can avoid calling the dummy serdes.
4. Changed the key type in `ThreadCache` from `byte[]` to `Bytes`; since SessionStore / WindowStore results are serialized as `Bytes` anyway, this saves unnecessary `Bytes.get()` and `Bytes.wrap(bytes)` calls (see the sketch below).
Each of these should arguably be a separate PR, and I apologize for the mess: this branch was extracted from a rather large diff with multiple refactorings mingled together, and dguy and I have already put a lot of effort into breaking it down into a few separate PRs; this is the only leftover work. Such a PR won't happen again in the future.
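On point 4, a small standalone illustration of why `Bytes` keys help (the map here is illustrative; `Bytes` is `org.apache.kafka.common.utils.Bytes`):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.utils.Bytes;

// Raw byte[] keys use identity hashCode/equals and cannot index a hash map;
// Bytes wraps the array with content-based semantics. Keeping cache keys as
// Bytes end-to-end avoids repeated Bytes.wrap()/Bytes.get() round trips.
Map<Bytes, String> cache = new HashMap<>();
cache.put(Bytes.wrap(new byte[]{1, 2, 3}), "value");
String hit = cache.get(Bytes.wrap(new byte[]{1, 2, 3})); // found via content equality
```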
Ping dguy enothereska mjsax for reviews
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax, Jun Rao
Closes #2333 from guozhangwang/K3452-followup-state-store-refactor
Exception paths in `register()`, `topic()`, `partition()`, `offset()`, and `timestamp()` were not covered by any existing tests.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Matthias J. Sax
Closes #2447 from dguy/KAFKA-4646
Add coverage for exception paths in `initialize()`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Matthias J. Sax, Guozhang Wang
Closes #2452 from dguy/kafka-4649
…tReset
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax, Guozhang Wang
Closes #2464 from bbejeck/KAFKA-4662_improve_topology_builder_test_coverage
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma, Will Marshall, Damian Guy, Guozhang Wang, Michael G. Noll
Closes #2461 from mjsax/addStreamsUpdateSecton
Author: Maysam Yabandeh <myabandeh@dropbox.com>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes #2474 from ijuma/kafka-4039-deadlock-during-shutdown
The issue of transiently having duplicates is due to the design of the left join test itself: in order to ignore partially joined results such as `A:null`, it lets the producer potentially send twice to source stream one, and relies on all of the following conditions being true in order to pass:
1. `receiveMessages` happens to have fetched all the produced results and committed offsets.
2. The streams app happens to have completed sending all result data.
3. The consumer used in `receiveMessages` gets all messages in a single `poll()`.
If any of the above is not true, the test fails.
Fixed this test by adding a filter right after the left join to drop partially joined results (see the sketch below). Also did minor cleanup of the integration test utils.
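A minimal sketch of the fix, assuming `stream1` and `stream2` are `KStream<String, String>` with default serdes (the joiner and window size are illustrative):

```java
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

KStream<String, String> joined = stream1.leftJoin(stream2,
        (v1, v2) -> v1 + ":" + v2,   // v2 is null until the matching record arrives
        JoinWindows.of(10000L));
joined.filter((key, value) -> !value.endsWith(":null")); // drop partial join results
```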
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy, Ewen Cheslack-Postava
Closes #2485 from guozhangwang/K3896-duplicate-join-results
The `toString()` method prints the topology, but had no tests making sure it works and doesn't cause exceptions.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Eno Thereska, Guozhang Wang
Closes #2444 from dguy/KAFKA-4645
Add a test to ensure a `StreamsException` is thrown when an exception other than `StreamsException` is caught
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2450 from dguy/KAFKA-4647
Most of the exception paths weren't covered. Now they are.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Guozhang Wang
Closes #2442 from dguy/KAFKA-4642
Kafka brokers have a config called "offsets.topic.replication.factor" that specifies the replication factor for the "__consumer_offsets" topic. The problem is that this config isn't being enforced. If an attempt to create the internal topic is made when there are fewer brokers than "offsets.topic.replication.factor", the topic ends up getting created anyway with the current number of live brokers. The current behavior is pretty surprising when you have clients or tooling running as the cluster is getting set up. Even if your cluster ends up being huge, you'll find out much later that __consumer_offsets was set up with no replication.
The cluster not meeting the "offsets.topic.replication.factor" requirement on the internal topic is another way of saying the cluster isn't fully set up yet.
The right behavior is for "offsets.topic.replication.factor" to be enforced. Creation of the internal topic should fail with GROUP_COORDINATOR_NOT_AVAILABLE until the "offsets.topic.replication.factor" requirement is met. This closely resembles the behavior of regular topic creation when the requested replication factor exceeds the current size of the cluster, as that request fails with error INVALID_REPLICATION_FACTOR.
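The enforcement amounts to a check like the following (a Java sketch of the broker-side logic; the method and parameter names are illustrative):

```java
import org.apache.kafka.common.protocol.Errors;

// Refuse to auto-create __consumer_offsets until the cluster can satisfy the
// configured replication factor, instead of silently under-replicating it.
Errors maybeCreateOffsetsTopic(int aliveBrokers, short offsetsTopicReplicationFactor) {
    if (aliveBrokers < offsetsTopicReplicationFactor)
        return Errors.GROUP_COORDINATOR_NOT_AVAILABLE; // cluster not fully set up yet
    // otherwise proceed to create __consumer_offsets with the configured factor
    return Errors.NONE;
}
```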
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2177 from onurkaraman/KAFKA-3959
Makes task assignment more sticky by preferring to assign tasks to the clients that previously had them as active tasks. If no client previously had the task active, search for one that previously had it as a standby, finally falling back to the least-loaded client.
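A minimal sketch of the preference order (the data structures are illustrative, not the actual assignor state):

```java
import java.util.Comparator;
import java.util.Map;

// Pick a client for a task: previous active owner, then a previous standby
// owner, then the least-loaded client.
String assign(String taskId,
              Map<String, String> prevActiveOwner,   // taskId -> client
              Map<String, String> prevStandbyOwner,  // taskId -> client
              Map<String, Integer> clientLoad) {     // client -> assigned task count
    String client = prevActiveOwner.get(taskId);
    if (client == null)
        client = prevStandbyOwner.get(taskId);
    if (client == null)
        client = clientLoad.entrySet().stream()
                .min(Comparator.comparingInt(Map.Entry::getValue))
                .get().getKey();
    return client;
}
```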
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes #2429 from dguy/kafka-4677
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Manikumar reddy O <manikumar.reddy@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2469 from hachikuji/improve-consumer-logging