guozhangwang enothereska please review
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Matthias J. Sax, Michael G. Noll, Guozhang Wang
Closes#1565 from dguy/kafka-3912
https://issues.apache.org/jira/browse/KAFKA-3922
KAFKA-3922 add copy-constructor to AbstractStream class
This copy-constructor allow to access protected variables from subclasses.
It should be used to extend KStreamImpl and KTableImpl classes by implementing a decorator pattern.
Author: Florian Hussonnois <florian.hussonnois@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1581 from fhussonnois/KAFKA-3922
Mark all public `TopologyBuilder` methods as synchronized as they can modify data-structures and these methods could be called from multiple threads
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1633 from dguy/kafka-3855
This contribution is my original work, and I license the work to the project under the project's open source license.
Author: Samuel Taylor <staylor@square-root.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1630 from ssaamm/trunk
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Geoff Anderson, Guozhang Wang, Ismael Juma
Closes#1621 from enothereska/simple-benchmark-streams-system-tests
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Damian Guy, Matthias J. Sax, Michael G. Noll, Guozhang Wang
Closes#1526 from enothereska/expose-names-dsl
Author: Wan Wenli <wwl.990@hotmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1612 from swwl1992/ticket-KAFKA-3952-fix-consumer-rebalance-verifier
- Added validity checks for input parameters on subscribe, assign to avoid NPE, and provide an argument exception instead
- Updated behavior for subscription with null collection to be same as when subscription with emptyList.i.e., unsubscribes.
- Added tests on subscription, assign
Author: Rekha Joshi <rekhajoshm@gmail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#1601 from rekhajoshm/KAFKA-3905-1
Also move the initialization that restores from changelog to inner stores.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Eno Thereska, Dan Norwood
Closes#1610 from guozhangwang/K3941-avoid-eviction-listener
Full credit for figuring out the cause of these failures goes to hachikuji.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Guozhang Wang, Ismael Juma, Jason Gustafson
Closes#1594 from vahidhashemian/KAFKA-3931
Also made a pass over the streams unit tests, with the following changes:
1. Removed three integration tests as they are already covered by other integration tests.
2. Merged `KGroupedTableImplTest` into `KTableAggregateTest`.
3. Use mocks whenever possible to reduce code duplicates.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1604 from guozhangwang/Kminor-unit-tests-consolidation
Fix timing window in producer by holding onto cluster object while processing send requests so that changes to cluster during metadata refresh don't cause NPE if a topic is deleted.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Ewen Cheslack-Postava <ewen@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#1478 from rajinisivaram/KAFKA-3562
It was previously only deleting files/folders where the path started with /tmp. Changed it to delete from the value of the System Property `java.io.tmpdir`. Also changed the tests that were creating State dirs under /tmp to just use `TestUtils.tempDirectory(..)`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1600 from dguy/kafka-3942
I've updated the ops documentation with information on using the XFS filesystem, based on LinkedIn's testing (and subsequent switch from EXT4).
I've also added some information to clarify the potential risk to the suggested EXT4 options (again, based on my experience with a multiple broker failure situation).
Author: Todd Palino <tpalino@linkedin.com>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Dana Powers <dana.powers@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1605 from toddpalino/trunk
Author: Ashish Singh <asingh@cloudera.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1515 from SinghAsDev/KAFKA-3849
Author: Nafer Sanabria <nafr.snabr@gmail.com>
Reviewers: Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1595 from naferx/minor-typo
Since 0.9.0.1 Configuration parameter log.cleaner.enable is now true by default.
Author: Nihed MBAREK <nihedmm@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1592 from nihed/patch-1
Interval lengths for ConsumerPerformance could sometime be calculated as zero. In such cases, when the bytes read or messages read are also zero a NaN output is returned for mbRead per second or for nMsg per second, whereas zero would be a more appropriate output.
In cases where interval length is zero but there have been data and messages to read, an output of Infinity is returned, as expected.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Dong Lin <lindong28@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#788 from vahidhashemian/KAFKA-3111
Also handle Null value in SmokeTestUtil.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>
Closes#1597 from guozhangwang/KHotfix-check-null
Minor changes to check null changes.
Author: Jeyhun Karimov <je.karimov@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1591 from jeyhunkarimov/KAFKA-3836
…int to the console.
Author: bbejeck <bbejeck@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#1577 from bbejeck/KAFKA-3794-add-prefix-to-print-functions
There seems to be a bug in the JDK that on some versions the mtime of
the file is modified on FileChannel.truncate() even if the javadoc states
`If the given size is greater than or equal to the file's current size then
the file is not modified.`.
This causes problems with log retention, as all the files then look like
they contain recent data to Kafka. Therefore this is only done if the channel size is different to the target size.
Author: Moritz Siuts <m.siuts@emetriq.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1497 from msiuts/KAFKA-3802-log_mtimes_reset_on_broker_shutdown
`Windows.segments(...)` and `Windows.until(...)` currently aren't returning the `Window` with its type param `W`. This causes the generic type to be lost and therefore methods using this can't infer the correct return types.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes#1587 from dguy/windows-generics
The `KStream.groupBy(..)` calls don't change the value, only the key, so they don't need the type param `V1` as the new stream will always be of type `KStream<K1, V>`.
The `Serde` in the overloaded `groupBy` should have a type param of `V` to match the returned `KStream`
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Guozhang Wang
Closes#1584 from dguy/kstream-generics
follow-up to auto-through feature:
- add sourceNode to transform()
- enable auto-repartitioning in merge()
- null check not required anymore (always join-able due to auto-through)
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1580 from mjsax/hotfix
This patch fixes two issues:
1. Subsequent regex subscriptions fail with the new consumer.
2. Subsequent regex subscriptions would not immediately refresh metadata to change the subscription of the new consumer and trigger a rebalance.
The final note on the JIRA stating that a later created topic that matches a consumer's subscription pattern would not be assigned to the consumer upon creation seems to be as designed. A repeat
`subscribe()` to the same pattern or some wait time until the next automatic metadata refresh would handle that case.
An integration test was also added to verify these issues are fixed with this PR.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1572 from vahidhashemian/KAFKA-3854
The contribution is my original work and that I license the work to the project under the project's open source license.
Contributors: Guozhang Wang, Phil Derome
guozhangwang
Added checkEmpty to validate processor does nothing and added a inhibit check for filter to fix issue.
Author: Philippe Derome <phderome@gmail.com>
Author: Phil Derome <phderome@gmail.com>
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1556 from phderome/DEROME-3902
This is the part I of the work to add the StreamsConfig to ProcessorContext.
We need to access StreamsConfig in the ProcessorContext so other components (e.g. RocksDBWindowStore or LRUCache can retrieve config parameter from application)
Author: Henry Cai <hcai@pinterest.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1553 from HenryCaiHaiying/config
Current task assignment in TaskAssignor is not deterministic.
During cluster restart or rolling restart, we have the same set of participating worker nodes. But the current TaskAssignor is not able to maintain a deterministic mapping, so about 20% partitions will be reassigned which would cause state repopulation.
When the topology of work nodes (# of worker nodes, the TaskIds they are carrying with) is not changed, we really just want to keep the old task assignment.
Add the code to check whether the node topology is changing or not:
- when the prevAssignedTasks from the old clientStates is the same as the new task list
- when there is no new node joining (its prevAssignTasks would be either empty or conflict with some other nodes)
- when there is no node dropping out (the total of prevAssignedTasks from other nodes would not be equal to the new task list)
When the topology is not changing, we would just use the old mapping.
I also added the code to check whether the previous assignment is balanced (whether each node's task list is within [1/2 average -- 2 * average]), if it's not balanced, we will still start the a new task assignment.
Author: Henry Cai <hcai@pinterest.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1543 from HenryCaiHaiying/upstream
The reasons to remove it are:
1. It's currently broken. The purpose of the [JIRA](https://issues.apache.org/jira/browse/KAFKA-3761) was to report that the RunningAsController state gets overwritten back to "RunningAsBroker".
2. It's not a useful state.
a. If clients want to use this metric to know whether a broker is ready to receive requests or not, they do not care whether or not the broker is the controller
b. there is already a separate boolean property, KafkaController.isActive which contains this information.
Author: Roger Hoover <roger.hoover@gmail.com>
Reviewers: Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1437 from theduderog/KAFKA-3761-broker-state
ijuma i checked the cases where this test has failed and it seems to always be on the verification of the left join. I've ran this test plenty of times and i can't get it to fail. However in the interest of having stable builds, i've removed just the part of the test that is failing (which happens to be the last verification).
Thanks,
Damian
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Ismael Juma, Guozhang Wang
Closes#1549 from dguy/kafka-3896
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>
Closes#1550 from guozhangwang/Kminor-grouppartitioner-javadoc