guozhangwang
* added ```PartitionGrouper``` (abstract class)
* This class is responsible for grouping partitions. Each group forms a task.
* Users may implement this class for custom grouping.
* added ```DefaultPartitionGrouper```
* our default implementation of ```PartitionGrouper```
* added ```KafkaStreamingPartitionAssignor```
* We always use this as ```PartitionAssignor``` of stream consumers.
* Actual grouping is delegated to ```PartitionGrouper```.
* ```TopologyBuilder```
* added ```topicGroups()```
* This returns groups of related topics according to the topology
* added ```copartitionSources(sourceNodes...)```
* This is used by DSL layer. It asserts the specified source nodes must be copartitioned.
* added ```copartitionGroups()```
* This returns groups of copartitioned topics
* KStream layer
* keep track of source nodes to determine copartition sources when steams are joined
* source nodes are set to null when partitioning property is not preserved (ex. ```map()```, ```transform()```), and this indicates the stream is no longer joinable
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#353 from ymatsuda/grouping
This adds coordination between DistributedHerders using the generalized consumer
support, allowing automatic balancing of connectors and tasks across workers. A
few pieces that require interaction between workers (resolving config
inconsistencies, forwarding of configuration changes to the leader worker) are
incomplete because they require REST API support to implement properly.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Jason Gustafson, Gwen Shapira
Closes#321 from ewencp/kafka-2371-distributed-herder
This pull request adds a configuration parameter and a migration tool. It is also based on pull request #303, which should go in first.
Author: flavio junqueira <fpj@apache.org>
Author: Flavio Junqueira <fpj@apache.org>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#313 from fpj/KAFKA-2641
Changes in this patch are:
1. ClientIdConfigHandler now passes through the config changes to the quota manager.
2. Removed static KafkaConfigs for quota overrides. These are no longer needed since we can override configs through ZooKeeper.
3. Added testcases to verify that the config changes are propogated from ZK (written using AdminTools) to the actual Metric objects.
Author: Aditya Auradkar <aauradka@aauradka-mn1.(none)>
Author: Aditya Auradkar <aauradka@aauradka-mn1.linkedin.biz>
Reviewers: Dong Lin <lindong28@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#298 from auradkar/K-2209
The test required a specific sequence of events for each Consumer.poll() call,
but the MockConsumer.waitForPollThen() method could not guarantee that,
resulting in race conditions. Add support for scheduling sequences of events
even when running in multi-threaded environments.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Guozhang Wang
Closes#333 from ewencp/kafka-2667-kafka-based-log-transient-error
This fix applies to three JIRAs, since they are all connected.
KAFKA-2459Connection backoff/blackout period should start when a connection is disconnected, not when the connection attempt was initiated
Backoff when connection is disconnected
KAFKA-2615Poll() method is broken wrt time
Added Time through the NetworkClient API. Minimal change.
KAFKA-1843Metadata fetch/refresh in new producer should handle all node connection states gracefully
I’ve partially addressed this for a specific failure case in the JIRA.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ewen Cheslack-Postava, Jason Gustafson, Ismael Juma, Guozhang Wang
Closes#290 from enothereska/trunk
This PR implements SASL/Kerberos which was originally submitted by harshach as https://github.com/apache/kafka/pull/191.
I've been submitting PRs to Harsha's branch with fixes and improvements and he has integrated all, but the most recent one. I'm creating this PR so that the Jenkins can run the tests on the branch (they pass locally).
Author: Ismael Juma <ismael@juma.me.uk>
Author: Sriharsha Chintalapani <harsha@hortonworks.com>
Author: Harsha <harshach@users.noreply.github.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>, Parth Brahmbhatt <brahmbhatt.parth@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#334 from ijuma/KAFKA-1686-V1
Removed default hardcoded keystore and truststore in /tmp so that default JVM keystore/truststore may be used when keystore/truststore is not specified in Kafka server or client properties
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#312 from rajinisivaram/KAFKA-2656
I've split the work of KAFKA-1695 because this refactoring touches a large number of files. Most of the changes are trivial, but I feel it will be easier to review this way.
This pull request includes the one Parth-Brahmbhatt started to address KAFKA-1695.
Author: flavio junqueira <fpj@apache.org>
Author: Flavio Junqueira <fpj@apache.org>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#303 from fpj/KAFKA-2639
Let's say every consumer in a group has session timeout s. Currently, if a consumer leaves the group, the worst case time to stabilize the group is 2s (s to detect the consumer failure + s for the rebalance window). If a consumer instead can declare they are leaving the group, the worst case time to stabilize the group would just be the s associated with the rebalance window.
This is a low priority optimization!
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jason Gustafson, Guozhang Wang
Closes#103 from onurkaraman/leave-group
See here for more discussion: https://issues.apache.org/jira/browse/KAFKA-2419
Basically, the fix involves adding a param to Metrics to indicate if it is capable of metric cleanup or not.
Author: Aditya Auradkar <aauradka@aauradka-mn1.linkedin.biz>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#323 from auradkar/KAFKA-2419-fix
Trivial fix to get rid of unused statements in kafkaProducer.
Author: Mayuresh Gharat <mgharat@mgharat-ld1.linkedin.biz>
Reviewers: Edward Ribeiro, Guozhang Wang
Closes#320 from MayureshGharat/kafka-2120-followup
This also adds some other needed infrastructure for distributed Copycat, most
importantly the DistributedHerder, and refactors some code for handling
Kafka-backed logs into KafkaBasedLog since this is shared betweeen offset and
config storage.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Gwen Shapira, James Cheng
Closes#241 from ewencp/kafka-2372-copycat-distributed-config
Enables Cipher suite setting. Code was previously reviewed by ijuma, harshach. Moving to an independent PR.
Author: benstopford <benstopford@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Sriharsha Chintalapani <harsha@hortonworks.com>
Closes#301 from benstopford/cipher-switch
This is a followup ticket from KAFKA-2084 to improve the windowSize calculation in Quotas. I've made the following changes:
1. Added a windowSize function on Rate
2. Calling Rate.windowSize in ClientQuotaManager to return the exact window size to use when computing the delay time.
3. Changed the window size calculation subtly. The current calculation had a bug wherein, it used the number of elapsed seconds from the "lastWindowSeconds" of the most recent Sample object. However, the lastWindowSeconds is the time when the sample is created.. this causes an issue because it implies that the current window elapsed time is always "0" when the sample is created. This is incorrect as demonstrated in a testcase I added in MetricsTest. I've fixed the calculation to count the elapsed time from the "oldest" sample in the set since that gives us an accurate value of the exact amount of time elapsed
Author: Aditya Auradkar <aauradkar@linkedin.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Joel Koshy <jjkoshy.w@gmail.com>
Closes#213 from auradkar/K-2443
* Call `ConnectionQuotas.decr` when calling `Selector.close` and when disconnections happen.
* Expand `SocketServerTest` to test for this and to close sockets.
* Refactor and clean-up `SocketServer` and `Acceptor` to make the code easier to understand.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#288 from ijuma/kafka-2614-connection-count-not-updated
Add sanity test in kafkaConsumer for the timeouts. This is a followup ticket for Kafka-2120.
Author: Mayuresh Gharat <mgharat@mgharat-ld1.linkedin.biz>
Reviewers: Dong Lin, Ismael Juma, Guozhang Wang
Closes#282 from MayureshGharat/Kafka-2428
As discussed in KAFKA-2419 - I've added a time based sensor retention config to Sensor. Sensors that have not been "recorded" for 'n' seconds are eligible for expiration.
In addition to the time based retention, I've also altered several tests to close the Metrics and scheduler objects since they can cause leaks while running tests. This causes TestUtils.verifyNonDaemonThreadStatus to fail.
Author: Aditya Auradkar <aauradka@aauradka-mn1.linkedin.biz>
Author: Aditya Auradkar <aauradka@aauradka-mn1.(none)>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Joel Koshy <jjkoshy.w@gmail.com>
Closes#233 from auradkar/K-2419
Unit tests which mock buffer overflow and underflow in the SSL transport layer and fixes for the couple of issues in buffer overflow handling described in the JIRA.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Sriharsha Chintalapani <schintalapani@hortonworks.com>, Jun Rao <junrao@gmail.com>
Closes#205 from rajinisivaram/KAFKA-2534
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava, Jason Gustafson, Guozhang Wang
Closes#272 from ijuma/kafka-2640-remove-complete-all-poll-timeout
Remove state storage upon unclean shutdown and fix streaming metrics used for local state.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Edward Ribeiro, Yasuhiro Matsuda, Jun Rao
Closes#265 from guozhangwang/K2591
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Sriharsha Chintalapani <schintalapani@hortonworks.com>, Ben Stopford <benstopford@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#273 from ijuma/kafka-2517-ssl-zero-copy-regression
It compares upper bound with itself.
Author: Edward Ribeiro <edward.ribeiro@gmail.com>
Reviewers: Aditya Auradkar, Ismael Juma, Guozhang Wang
Closes#182 from eribeiro/equals-bug
This work has been contributed by Jesse Anderson, Randall Hauch, Yasuhiro Matsuda and Guozhang Wang. The detailed design can be found in https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+processor+client.
Author: Guozhang Wang <wangguoz@gmail.com>
Author: Yasuhiro Matsuda <yasuhiro.matsuda@gmail.com>
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Author: ymatsuda <yasuhiro.matsuda@gmail.com>
Author: Randall Hauch <rhauch@gmail.com>
Author: Jesse Anderson <jesse@smokinghand.com>
Author: Ismael Juma <ismael@juma.me.uk>
Author: Jesse Anderson <eljefe6a@gmail.com>
Reviewers: Ismael Juma, Randall Hauch, Edward Ribeiro, Gwen Shapira, Jun Rao, Jay Kreps, Yasuhiro Matsuda, Guozhang Wang
Closes#130 from guozhangwang/streaming