StreamThread should keep calling consumer.poll() even when no task is assigned. This is necessary to get a task.
guozhangwang
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#373 from ymatsuda/no_task
The branch wasn't rebased after the capitalisation fix for SSL
classes.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes#368 from ijuma/fix-kafka-log4j-appender-compiler-error
guozhangwang
* A task id is now a class, ```TaskId```, that has a topic group id and a partition id fields.
* ```TopologyBuilder``` assigns a topic group id to a topic group. Related methods are changed accordingly.
* A state store uses the partition id part of the task id as the change log partition id.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#365 from ymatsuda/task_id
guozhangwang
* added ```PartitionGrouper``` (abstract class)
* This class is responsible for grouping partitions. Each group forms a task.
* Users may implement this class for custom grouping.
* added ```DefaultPartitionGrouper```
* our default implementation of ```PartitionGrouper```
* added ```KafkaStreamingPartitionAssignor```
* We always use this as ```PartitionAssignor``` of stream consumers.
* Actual grouping is delegated to ```PartitionGrouper```.
* ```TopologyBuilder```
* added ```topicGroups()```
* This returns groups of related topics according to the topology
* added ```copartitionSources(sourceNodes...)```
* This is used by DSL layer. It asserts the specified source nodes must be copartitioned.
* added ```copartitionGroups()```
* This returns groups of copartitioned topics
* KStream layer
* keep track of source nodes to determine copartition sources when steams are joined
* source nodes are set to null when partitioning property is not preserved (ex. ```map()```, ```transform()```), and this indicates the stream is no longer joinable
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#353 from ymatsuda/grouping
This adds coordination between DistributedHerders using the generalized consumer
support, allowing automatic balancing of connectors and tasks across workers. A
few pieces that require interaction between workers (resolving config
inconsistencies, forwarding of configuration changes to the leader worker) are
incomplete because they require REST API support to implement properly.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Jason Gustafson, Gwen Shapira
Closes#321 from ewencp/kafka-2371-distributed-herder
This pull request adds a configuration parameter and a migration tool. It is also based on pull request #303, which should go in first.
Author: flavio junqueira <fpj@apache.org>
Author: Flavio Junqueira <fpj@apache.org>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#313 from fpj/KAFKA-2641
Probably happened while resolving conflicts, commit: 86eb74d9236c586af5889fe79f4b9e066c9c2af3
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson
Closes#350 from ijuma/restore-ssl-consumer-test
Changes in this patch are:
1. ClientIdConfigHandler now passes through the config changes to the quota manager.
2. Removed static KafkaConfigs for quota overrides. These are no longer needed since we can override configs through ZooKeeper.
3. Added testcases to verify that the config changes are propogated from ZK (written using AdminTools) to the actual Metric objects.
Author: Aditya Auradkar <aauradka@aauradka-mn1.(none)>
Author: Aditya Auradkar <aauradka@aauradka-mn1.linkedin.biz>
Reviewers: Dong Lin <lindong28@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#298 from auradkar/K-2209
The test required a specific sequence of events for each Consumer.poll() call,
but the MockConsumer.waitForPollThen() method could not guarantee that,
resulting in race conditions. Add support for scheduling sequences of events
even when running in multi-threaded environments.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Guozhang Wang
Closes#333 from ewencp/kafka-2667-kafka-based-log-transient-error
This fix applies to three JIRAs, since they are all connected.
KAFKA-2459Connection backoff/blackout period should start when a connection is disconnected, not when the connection attempt was initiated
Backoff when connection is disconnected
KAFKA-2615Poll() method is broken wrt time
Added Time through the NetworkClient API. Minimal change.
KAFKA-1843Metadata fetch/refresh in new producer should handle all node connection states gracefully
I’ve partially addressed this for a specific failure case in the JIRA.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ewen Cheslack-Postava, Jason Gustafson, Ismael Juma, Guozhang Wang
Closes#290 from enothereska/trunk
- Both TopicCommand and ConfigCommand warn if message.max.bytes increases
- Log failures on the broker if replication gets stuck due to an oversized message
- Added blocking call to warning.
Author: benstopford <benstopford@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#322 from benstopford/CPKAFKA-61
This PR implements SASL/Kerberos which was originally submitted by harshach as https://github.com/apache/kafka/pull/191.
I've been submitting PRs to Harsha's branch with fixes and improvements and he has integrated all, but the most recent one. I'm creating this PR so that the Jenkins can run the tests on the branch (they pass locally).
Author: Ismael Juma <ismael@juma.me.uk>
Author: Sriharsha Chintalapani <harsha@hortonworks.com>
Author: Harsha <harshach@users.noreply.github.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>, Parth Brahmbhatt <brahmbhatt.parth@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#334 from ijuma/KAFKA-1686-V1
LogCleanerIntegrationTest calls LogCleaner.awaitCleaned() to wait until cleaner has processed up to given offset. However, existing awaitCleaned() implementation doesn't wait for this. This patch fix the problem.
Author: Dong Lin <lindong@cis.upenn.edu>
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Ismael Juma, Guozhang Wang
Closes#327 from lindong28/KAFKA-2669
Removed default hardcoded keystore and truststore in /tmp so that default JVM keystore/truststore may be used when keystore/truststore is not specified in Kafka server or client properties
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#312 from rajinisivaram/KAFKA-2656
Before we switched from `BlockingChannel` to `NetworkClient`, we were
always reporting a successful connection due to the fact that
`BlockingChannel.connect` catches and swallows all exceptions. We
are now reporting failures (which is better), but `error` seems too
noisy (as can be seen in our tests).
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#280 from ijuma/reduce-connection-failure-logging-level
I've split the work of KAFKA-1695 because this refactoring touches a large number of files. Most of the changes are trivial, but I feel it will be easier to review this way.
This pull request includes the one Parth-Brahmbhatt started to address KAFKA-1695.
Author: flavio junqueira <fpj@apache.org>
Author: Flavio Junqueira <fpj@apache.org>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#303 from fpj/KAFKA-2639
Let's say every consumer in a group has session timeout s. Currently, if a consumer leaves the group, the worst case time to stabilize the group is 2s (s to detect the consumer failure + s for the rebalance window). If a consumer instead can declare they are leaving the group, the worst case time to stabilize the group would just be the s associated with the rebalance window.
This is a low priority optimization!
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jason Gustafson, Guozhang Wang
Closes#103 from onurkaraman/leave-group
Added a new `KeyValueStore` implementation called `InMemoryLRUCacheStore` that keeps a maximum number of entries in-memory, and as the size exceeds the capacity the least-recently used entry is removed from the store and the backing topic. Also added unit tests for this new store and the existing `InMemoryKeyValueStore` and `RocksDBKeyValueStore` implementations. A new `KeyValueStoreTestDriver` class simplifies all of the other tests, and can be used by other libraries to help test their own custom implementations.
This PR depends upon [KAFKA-2593](https://issues.apache.org/jira/browse/KAFKA-2593) and its PR at https://github.com/apache/kafka/pull/255. Once that PR is merged, I can rebase this PR if desired.
Two issues were uncovered when creating these new unit tests, and both are also addressed as separate (small) commits in this PR:
* The `RocksDBKeyValueStore` initialization was not creating the file system directory if missing.
* `MeteredKeyValueStore` was casting to `ProcessorContextImpl` to access the `RecordCollector`, which prevent using `MeteredKeyValueStore` implementations in tests where something other than `ProcessorContextImpl` was used. The fix was to introduce a `RecordCollector.Supplier` interface to define this `recordCollector()` method, and change `ProcessorContextImpl` and `MockProcessorContext` to both implement this interface. Now, `MeteredKeyValueStore` can cast to the new interface to access the record collector rather than to a single concrete implementation, making it possible to use any and all current stores inside unit tests.
Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Edward Ribeiro, Guozhang Wang
Closes#256 from rhauch/kafka-2594
See here for more discussion: https://issues.apache.org/jira/browse/KAFKA-2419
Basically, the fix involves adding a param to Metrics to indicate if it is capable of metric cleanup or not.
Author: Aditya Auradkar <aauradka@aauradka-mn1.linkedin.biz>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#323 from auradkar/KAFKA-2419-fix
Trivial fix to get rid of unused statements in kafkaProducer.
Author: Mayuresh Gharat <mgharat@mgharat-ld1.linkedin.biz>
Reviewers: Edward Ribeiro, Guozhang Wang
Closes#320 from MayureshGharat/kafka-2120-followup
guozhangwang
StreamTaskTest did not set up a temp directory for each test. This occasionally caused interference between tests through state directory locking.
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#317 from ymatsuda/fix_StreamTaskTest
guozhangwang
This change aims to remove unnecessary ```consumer.poll(0)``` calls.
* ```once``` after some partition is resumed
* whenever the size of the top queue in any task is below ```BUFFERED_RECORDS_PER_PARTITION_CONFIG```
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Reviewers: Guozhang Wang
Closes#315 from ymatsuda/less_poll_zero
This is a minimal revert of some backward incompatible changes made in KAFKA-2205, with the addition of the deprecation logging message.
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Gwen Shapira
Closes#305 from granthenke/topic-configs