It compares upper bound with itself.
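The bug pattern described above can be sketched as follows. This is a minimal illustration only: the `Bound` class and its fields are hypothetical and are not the class touched by this patch.

```java
import java.util.Objects;

// Hypothetical value class, used only to illustrate the bug pattern.
final class Bound {
    private final int lower;
    private final int upper;

    Bound(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    // Buggy variant: the second comparison checks this.upper against itself,
    // so it is always true and the other object's upper bound is ignored.
    boolean buggyEquals(Bound that) {
        return this.lower == that.lower && this.upper == this.upper;
    }

    // Corrected variant: both fields are compared against the other object.
    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Bound)) return false;
        Bound that = (Bound) obj;
        return this.lower == that.lower && this.upper == that.upper;
    }

    @Override
    public int hashCode() {
        return Objects.hash(lower, upper);
    }
}
```

With the buggy variant, two objects that differ only in their upper bound are reported as equal.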
Author: Edward Ribeiro <edward.ribeiro@gmail.com>
Reviewers: Aditya Auradkar, Ismael Juma, Guozhang Wang
Closes #182 from eribeiro/equals-bug
This work has been contributed by Jesse Anderson, Randall Hauch, Yasuhiro Matsuda and Guozhang Wang. The detailed design can be found in https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+processor+client.
Author: Guozhang Wang <wangguoz@gmail.com>
Author: Yasuhiro Matsuda <yasuhiro.matsuda@gmail.com>
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Author: ymatsuda <yasuhiro.matsuda@gmail.com>
Author: Randall Hauch <rhauch@gmail.com>
Author: Jesse Anderson <jesse@smokinghand.com>
Author: Ismael Juma <ismael@juma.me.uk>
Author: Jesse Anderson <eljefe6a@gmail.com>
Reviewers: Ismael Juma, Randall Hauch, Edward Ribeiro, Gwen Shapira, Jun Rao, Jay Kreps, Yasuhiro Matsuda, Guozhang Wang
Closes #130 from guozhangwang/streaming
hachikuji ewencp I found this problem when adding the new consumer to mirror maker, which commits offsets in the rebalance callback. It is not clear to me why we are triggering a rebalance for commitSync() and fetchCommittedOffset(). Can you help review to see if I missed something?
Regarding commitSync(): after each poll(), each partition is either assigned to the consumer or already revoked. As long as the user is using the internal offset map, the offset map will always be valid, i.e. it will only ever contain the assigned partitions when commitSync() is called. Hence there is no need to trigger a rebalance in commitSync().
The same guarantee also applies to fetchCommittedOffset(). Isn't the only requirement to ensure that we know the coordinator?
Another related issue is that today the IllegalGenerationIdException is a bit confusing. When we receive an IllegalGenerationIdException from a heartbeat, we still need to use that same generation ID to commit offsets, and the coordinator will accept it. So the generation ID was not really illegal. I will file a ticket for this issue.
Author: Jiangjie Qin <becket.qin@gmail.com>
Reviewers: Jason Gustafson, Guozhang Wang
Closes #221 from becketqin/KAFKA-2555
Author: Dong Lin <lindong28@gmail.com>
Author: Dong Lin <lindong@cis.upenn.edu>
Reviewers: Jason Gustafson, Ismael Juma, Guozhang Wang
Closes #118 from lindong28/KAFKA-2390
Author: Parth Brahmbhatt <brahmbhatt.parth@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes #195 from Parth-Brahmbhatt/KAFKA-2211
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Aditya Auradkar <aauradkar@linkedin.com>, Jun Rao <junrao@gmail.com>
Closes #194 from ijuma/kafka-2440-use-network-client-in-fetcher
An attempt to remove the commit type from the new consumer. The coordinator constructor takes a default offset commit callback, mainly for testing purposes.
Author: Jiangjie Qin <becket.qin@gmail.com>
Reviewers: Ewen Cheslack-Postava, Jason Gustafson, Guozhang Wang
Closes #134 from becketqin/KAFKA-2389
Author: Ashish Singh <asingh@cloudera.com>
Reviewers: Jason Gustafson, Guozhang Wang, Edward Ribeiro, Ismael Juma
Closes #128 from SinghAsDev/KAFKA-1893
ewencp
The changes here are smaller than they look - mostly refactoring/cleanup.
- ConsumerPerformanceService: added new_consumer flag, and exposed more command-line settings
- benchmark.py: refactored to use `parametrize` and `matrix` - this reduced some amount of repeated code
- benchmark.py: added consumer performance tests with new consumer (using `parametrize`)
- benchmark.py: added more detailed test descriptions
- performance.py: broke into separate files
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Ewen Cheslack-Postava, Jason Gustafson, Gwen Shapira
Closes #179 from granders/KAFKA-2489-benchmark-new-consumer
Small clarification to docs. Current behaviour could confuse when doing something like:
consumer.seekToEnd()
producer.send(msg)
consumer.poll() //would return msg as seek evaluates lazily
Author: Ben Stopford <benstopford@gmail.com>
Reviewers: Gwen Shapira
Closes #199 from benstopford/minor-stuff
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Gwen Shapira <cshapi@gmail.com>
Closes #151 from ijuma/kafka-2411-remove-usage-of-blocking-channel
No Jira ticket created, as the Contributing Code Changes doc says it's not necessary for javadoc typo fixes.
Author: Magnus Reftel <magnus.reftel@skatteetaten.no>
Reviewers: Gwen Shapira
Closes #186 from magnusr/feature/its
The sleep() in KafkaConsumer's poll blocked any pending IO from being completed and created a performance bottleneck. It was intended to implement the fetch backoff behavior, but that was a misunderstanding of the setting "retry.backoff.ms" which should only affect failed fetches.
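The intended semantics, where the backoff delays a new fetch only after a failure, can be sketched roughly as below. This is a toy model under stated assumptions, not the KafkaConsumer implementation: the `FetchBackoff` class and its method names are hypothetical, and the caller supplies the clock value.

```java
// Toy model of "back off only after a failed fetch"; not Kafka code.
final class FetchBackoff {
    private final long retryBackoffMs;
    // Earliest time at which the next fetch may be sent; 0 means no restriction.
    private long blockedUntilMs = 0L;

    FetchBackoff(long retryBackoffMs) {
        this.retryBackoffMs = retryBackoffMs;
    }

    // A failed fetch starts the backoff window.
    void onFetchFailure(long nowMs) {
        blockedUntilMs = nowMs + retryBackoffMs;
    }

    // A successful fetch never delays the next one.
    void onFetchSuccess() {
        blockedUntilMs = 0L;
    }

    // No thread ever sleeps here: the caller simply skips sending a fetch
    // while still being free to complete other pending IO.
    boolean canFetch(long nowMs) {
        return nowMs >= blockedUntilMs;
    }
}
```

The design point is that the backoff is tracked as state rather than enforced with a blocking sleep, so unrelated IO is not held up.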
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ewen Cheslack-Postava, Gwen Shapira
Closes #180 from hachikuji/KAFKA-2486
The Converter class now translates directly between byte[] and Copycat's data
API instead of requiring an intermediate runtime type like Avro's GenericRecord
or Jackson's JsonNode.
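As a rough sketch of what a direct byte[]-to-data conversion looks like, consider the following. The `BytesConverter` interface and its method names are assumptions made for illustration; they are not the actual Copycat Converter signature.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical interface: converts directly between a data value and byte[],
// with no intermediate serialization-specific runtime type in between.
interface BytesConverter<T> {
    byte[] fromData(T value);

    T toData(byte[] bytes);
}

// A string-based implementation, loosely in the spirit of a converter for raw strings.
final class Utf8StringConverter implements BytesConverter<String> {
    @Override
    public byte[] fromData(String value) {
        return value == null ? null : value.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String toData(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }
}
```

A value passed through fromData and then toData should come back unchanged, and null is passed through in both directions.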
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Gwen Shapira
Closes #172 from ewencp/kafka-2475-unified-serializer-converter and squashes the following commits:
566c52f [Ewen Cheslack-Postava] Checkstyle fixes
320d0df [Ewen Cheslack-Postava] Restrict offset format.
85797e7 [Ewen Cheslack-Postava] Add StringConverter for using Copycat with raw strings.
698d65c [Ewen Cheslack-Postava] Move and update outdated comment about handling of types for BYTES type in Copycat.
4bed051 [Ewen Cheslack-Postava] KAFKA-2475: Make Copycat only have a Converter class instead of Serializer, Deserializer, and Converter.
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Edward Ribeiro, Onur Karaman, Ismael Juma, Guozhang Wang
Closes #139 from hachikuji/KAFKA-2388 and squashes the following commits:
377c67e [Jason Gustafson] KAFKA-2388; refactor KafkaConsumer subscribe API
This PR adds the StopReplica request and response, as required by ijuma for KAFKA-2411. Migration of the core module is addressed in a separate PR (#141).
ijuma Could you review it? gwenshap Could you take a look as well?
Author: David Jacot <david.jacot@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes #170 from dajac/KAFKA-2072-part-1
This is an initial patch implementing the basics of Copycat for KIP-26.
The intent here is to start a review of the key pieces of the core API and get a reasonably functional, baseline, non-distributed implementation of Copycat in place to get things rolling. The current patch has a number of known issues that need to be addressed before a final version:
* Some build-related issues. Specifically, requires some locally-installed dependencies (see below), ignores checkstyle for the runtime data library because it's lifted from Avro currently and likely won't last in its current form, and some Gradle task dependencies aren't quite right because I haven't gotten rid of the dependency on `core` (which should now be an easy patch since new consumer groups are in a much better state).
* This patch currently depends on some Confluent trunk code because I prototyped with our Avro serializers w/ schema-registry support. We need to figure out what we want to provide as an example built-in set of serializers. Unlike core Kafka where we could ignore the issue, providing only ByteArray or String serializers, this is pretty central to how Copycat works.
* This patch uses a hacked up version of Avro as its runtime data format. Not sure if we want to go through the entire API discussion just to get some basic code committed, so I filed KAFKA-2367 to handle that separately. The core connector APIs and the runtime data APIs are entirely orthogonal.
* This patch needs some updates to get aligned with recent new consumer changes (specifically, I'm aware of the ConcurrentModificationException issue on exit). More generally, the new consumer is in flux but Copycat depends on it, so there are likely to be some negative interactions.
* The layout feels a bit awkward to me right now because I ported it from a Maven layout. We don't have nearly the same level of granularity in Kafka currently (core and clients, plus the mostly ignored examples, log4j-appender, and a couple of contribs). We might want to reorganize, although keeping data+api separate from runtime and connector plugins is useful for minimizing dependencies.
* There are a variety of other things (e.g., I'm not happy with the exception hierarchy or how exceptions are currently handled, TopicPartition doesn't really need to be duplicated unless we want Copycat entirely isolated from the Kafka APIs, etc.), but I expect we'll cover those in the review.
Before commenting on the patch, it's probably worth reviewing https://issues.apache.org/jira/browse/KAFKA-2365 and https://issues.apache.org/jira/browse/KAFKA-2366 to get an idea of what I had in mind for a) what we ultimately want with all the Copycat patches and b) what we aim to cover in this initial patch. My hope is that we can use a WIP patch (after the current obvious deficiencies are addressed) while recognizing that we want to make iterative progress with a bunch of subsequent PRs.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Ismael Juma, Gwen Shapira
Closes #99 from ewencp/copycat and squashes the following commits:
a3a47a6 [Ewen Cheslack-Postava] Simplify Copycat exceptions, make them a subclass of KafkaException.
8c108b0 [Ewen Cheslack-Postava] Rename Coordinator to Herder to avoid confusion with the consumer coordinator.
7bf8075 [Ewen Cheslack-Postava] Make Copycat CLI specific to standalone mode, clean up some config and get rid of config storage in standalone mode.
656a003 [Ewen Cheslack-Postava] Clarify and expand the explanation of the Copycat Coordinator interface.
c0e5fdc [Ewen Cheslack-Postava] Merge remote-tracking branch 'origin/trunk' into copycat
0fa7a36 [Ewen Cheslack-Postava] Mark Copycat classes as unstable and reduce visibility of some classes where possible.
d55d31e [Ewen Cheslack-Postava] Reorganize Copycat code to put it all under one top-level directory.
b29cb2c [Ewen Cheslack-Postava] Merge remote-tracking branch 'origin/trunk' into copycat
d713a21 [Ewen Cheslack-Postava] Address Gwen's review comments.
6787a85 [Ewen Cheslack-Postava] Make Converter generic to match serializers since some serialization formats do not require a base class of Object; update many other classes to have generic key and value class type parameters to match this change.
b194c73 [Ewen Cheslack-Postava] Split Copycat converter option into two options for key and value.
0b5a1a0 [Ewen Cheslack-Postava] Normalize naming to use partition for both source and Kafka, adjusting naming in CopycatRecord classes to clearly differentiate.
e345142 [Ewen Cheslack-Postava] Remove Copycat reflection utils, use existing Utils and ConfigDef functionality from clients package.
be5c387 [Ewen Cheslack-Postava] Minor cleanup
122423e [Ewen Cheslack-Postava] Style cleanup
6ba87de [Ewen Cheslack-Postava] Remove most of the Avro-based mock runtime data API, only preserving enough schema functionality to support basic primitive types for an initial patch.
4674d13 [Ewen Cheslack-Postava] Address review comments, clean up some code styling.
25b5739 [Ewen Cheslack-Postava] Fix sink task offset commit concurrency issue by moving it to the worker thread and waking up the consumer to ensure it exits promptly.
0aefe21 [Ewen Cheslack-Postava] Add log4j settings for Copycat.
220e42d [Ewen Cheslack-Postava] Replace Avro serializer with JSON serializer.
1243a7c [Ewen Cheslack-Postava] Merge remote-tracking branch 'origin/trunk' into copycat
5a618c6 [Ewen Cheslack-Postava] Remove offset serializers, instead reusing the existing serializers and removing schema projection support.
e849e10 [Ewen Cheslack-Postava] Remove duplicated TopicPartition implementation.
dec1379 [Ewen Cheslack-Postava] Switch to using new consumer coordinator instead of manually assigning partitions. Remove dependency of copycat-runtime on core.
4a9b4f3 [Ewen Cheslack-Postava] Add some helpful Copycat-specific build and test targets that cover all Copycat packages.
31cd1ca [Ewen Cheslack-Postava] Add CLI tools for Copycat.
e14942c [Ewen Cheslack-Postava] Add Copycat file connector.
0233456 [Ewen Cheslack-Postava] Add copycat-avro and copycat-runtime
11981d2 [Ewen Cheslack-Postava] Add copycat-data and copycat-api
This also marks the consumer as unstable to show an example of using these annotations.
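A minimal sketch of how such a marker annotation might be declared and applied is shown below. The names `Unstable`, `ExperimentalClient`, and `StabilityCheck` are hypothetical, and the retention and target choices are assumptions rather than what this patch necessarily uses.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker annotation: signals that an API may change between releases.
// RUNTIME retention is an assumption made so the example can inspect it reflectively.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Unstable {
}

// Hypothetical class marked as unstable.
@Unstable
final class ExperimentalClient {
}

final class StabilityCheck {
    // Returns true if the given type carries the @Unstable marker.
    static boolean isUnstable(Class<?> type) {
        return type.isAnnotationPresent(Unstable.class);
    }
}
```

Because the annotation is marked @Documented, it would also appear in the generated javadoc of any class that carries it.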
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Gwen Shapira
Closes #133 from ewencp/stability-annotations and squashes the following commits:
09c15c3 [Ewen Cheslack-Postava] KAFKA-2429: Add annotations to mark classes as stable/unstable
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma, Ewen Cheslack-Postava and Guozhang Wang
Closes #131 from granthenke/minor-string and squashes the following commits:
3c6250d [Grant Henke] MINOR: Fix hard coded strings in ProduceResponse
Author: Ismael Juma <ismael@juma.me.uk>
Closes #126 from ijuma/minor-selector-javadoc-fixes and squashes the following commits:
a26f529 [Ismael Juma] Minor fixes to Selector's javadoc
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang
Closes #112 from hachikuji/KAFKA-2340 and squashes the following commits:
cc49ca2 [Jason Gustafson] KAFKA-2340; improve KafkaConsumer Fetcher test coverage
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang
Closes #116 from hachikuji/KAFKA-2400 and squashes the following commits:
3c1b1dd [Jason Gustafson] KAFKA-2400; expose heartbeat interval in KafkaConsumer configuration
If no offset has been committed, the committed method does not return a null value; a NoOffsetForPartitionException is thrown instead.
Author: Stevo Slavić <sslavic@gmail.com>
Reviewers: Ismael, Guozhang
Closes #89 from sslavic/patch-4 and squashes the following commits:
5c0a152 [Stevo Slavić] MINOR: Fixed javadoc for committed return value
In commit 0699ff2ce6 (diff-5533ddc72176acd1c32f5abbe94aa672), among other things, the possible auto.offset.reset options were changed from smallest to earliest and from largest to latest, but the documentation for that configuration property was not updated.
This patch fixes the documentation for the auto.offset.reset consumer configuration property so that it is in sync with the validation logic.
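The renamed values can be illustrated with a small sketch. The `OffsetResetConfig` helper is hypothetical and only mirrors the two renames mentioned above; the real validator may accept additional values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper that mirrors the renamed option values; not Kafka's validator.
final class OffsetResetConfig {
    static final String KEY = "auto.offset.reset";

    // Only the two values named in this message; the real list may be longer.
    private static final List<String> RENAMED_VALUES = Arrays.asList("earliest", "latest");

    static boolean isValid(String value) {
        return RENAMED_VALUES.contains(value);
    }

    // Builds consumer properties using the new-style value, rejecting old-style ones.
    static Properties withOffsetReset(String value) {
        if (!isValid(value)) {
            throw new IllegalArgumentException("Unsupported " + KEY + " value: " + value);
        }
        Properties props = new Properties();
        props.setProperty(KEY, value);
        return props;
    }
}
```

For example, `OffsetResetConfig.withOffsetReset("earliest")` yields properties with the new-style value, while the old `smallest` and `largest` spellings are rejected.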
Author: Stevo Slavić <sslavic@gmail.com>
Reviewers: Jason, Ismael, Guozhang
Closes #91 from sslavic/patch-5 and squashes the following commits:
f4c9656 [Stevo Slavić] MINOR: auto.offset.reset docs not in sync with validation
ConsumerRecords organizes records per topic partition, not per topic as the ConsumerRecords javadoc suggested.
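The difference between the two groupings can be sketched with a toy container. `PartitionKey` and `PartitionedRecords` are hypothetical stand-ins, not the ConsumerRecords API; record values are plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical key identifying one partition of one topic.
final class PartitionKey {
    final String topic;
    final int partition;

    PartitionKey(String topic, int partition) {
        this.topic = topic;
        this.partition = partition;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof PartitionKey)) return false;
        PartitionKey that = (PartitionKey) obj;
        return partition == that.partition && topic.equals(that.topic);
    }

    @Override
    public int hashCode() {
        return Objects.hash(topic, partition);
    }
}

// Toy container: records are stored per (topic, partition), not per topic.
final class PartitionedRecords {
    private final Map<PartitionKey, List<String>> byPartition = new LinkedHashMap<>();

    void add(String topic, int partition, String value) {
        byPartition.computeIfAbsent(new PartitionKey(topic, partition), k -> new ArrayList<>()).add(value);
    }

    // Direct lookup of one partition's records.
    List<String> records(String topic, int partition) {
        return byPartition.getOrDefault(new PartitionKey(topic, partition), Collections.<String>emptyList());
    }

    // A per-topic view has to be assembled by walking the matching partitions.
    List<String> records(String topic) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<PartitionKey, List<String>> entry : byPartition.entrySet()) {
            if (entry.getKey().topic.equals(topic)) result.addAll(entry.getValue());
        }
        return result;
    }

    int partitionCount() {
        return byPartition.size();
    }
}
```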
Author: Stevo Slavić <sslavic@gmail.com>
Reviewers: Jason, Guozhang
Closes #92 from sslavic/patch-6 and squashes the following commits:
b08a58d [Stevo Slavić] MINOR: ConsumerRecords are organized per topic partition
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael, Ashish, Guozhang
Closes #100 from hachikuji/KAFKA-2350 and squashes the following commits:
250e823 [Jason Gustafson] KAFKA-2350; KafkaConsumer pause/resume API