Besides API and runtime changes, this PR also includes 2 data transformations (`InsertField`, `HoistToStruct`) and 1 routing transformation (`TimestampRouter`).
There is some gnarliness in `ConnectorConfig` / `ConfigDef` around creating, parsing and validating a dynamic `ConfigDef`.
Author: Shikhar Bhushan <shikhar@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2299 from shikhar/smt-2017
Changed caching in LoginManager to allow one LoginManager per client
JAAS configuration.
Added test to End2EndAuthorization for SASL Plain and GSSAPI with two
consumers with different credentials.
Developed with mimaison.
Author: Edoardo Comar <ecomar@uk.ibm.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2293 from edoardocomar/KAFKA-4180d
Kafka Streams: add granular metrics per node and per task, and expose the ability to register non-latency metrics in StreamsMetrics.
Also added different recording levels to Metrics.
This is a joint contribution from Eno Thereska and Aarti Gupta.
from https://github.com/apache/kafka/pull/1362#issuecomment-218326690
We can consider adding metrics for process / punctuate / commit rate at the granularity of each processor node in addition to the global rate mentioned above. This is very helpful in debugging.
We can consider adding rate / total cumulated metrics for context.forward indicating how many records were forwarded downstream from this processor node as well. This is helpful in debugging.
We can consider adding metrics for each stream partition timestamp.
This is helpful in debugging.
Besides the latency metrics, we can also add throughput latency in terms of source records consumed.
More discussion here: https://issues.apache.org/jira/browse/KAFKA-3715, KIP-104, KIP-105
Author: Eno Thereska <eno@confluent.io>
Author: Aarti Gupta <aartiguptaa@gmail.com>
Reviewers: Greg Fodor, Ismael Juma, Damian Guy, Guozhang Wang
Closes #1446 from aartigupta/trunk
The client should send older versions of requests to the broker if necessary.
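A minimal sketch of the idea, assuming the client knows the range of versions it supports for an API and the range the broker advertises. The class and method names are hypothetical and are not Kafka's actual API.

```java
// Hypothetical sketch: pick the newest request version that both sides support.
final class VersionChooser {
    /**
     * Returns the highest version contained in both ranges.
     * Throws if the client and broker ranges do not overlap.
     */
    static short usableVersion(short clientMin, short clientMax,
                               short brokerMin, short brokerMax) {
        // The newest version both sides could possibly speak.
        short candidate = (short) Math.min(clientMax, brokerMax);
        // It must also be at least as new as both minimums.
        if (candidate < clientMin || candidate < brokerMin) {
            throw new IllegalStateException("No overlapping request version");
        }
        return candidate;
    }
}
```

With a client supporting versions 0 to 3 and a broker supporting 0 to 2, this sketch selects version 2, so the client downgrades its request instead of failing.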
Author: Colin P. Mccabe <cmccabe@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2264 from cmccabe/KAFKA-4507
Author: yaojuncn <yaojuncn@users.noreply.github.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Vahid Hashemian <vahidhashemian@us.ibm.com>, Konstantin <konstantin@tubemogul.com>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2128 from yaojuncn/KAFKA-4402-client-producer-round-robin-fix
Removed the extra ',' character while printing the replicas / in-sync replicas
array.
Author: Kamal <kamal@nmsworks.co.in>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #2306 from Kamal15/trunk
ProducerConfig calls AbstractConfig.init, which does the logging. KafkaProducer init invokes ProducerConfig.init twice, which leads to the configuration being logged twice.
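The pattern can be illustrated with a small stand-alone sketch. `LoggingConfig` and `DemoProducer` are hypothetical stand-ins, not the Kafka classes, and the "fixed" shape shown here is only one plausible way to avoid the duplicate log, not necessarily the change made in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a config class whose constructor logs its values.
final class LoggingConfig {
    // Captures "log" output so the demo can count entries.
    static final List<String> LOG = new ArrayList<>();
    final String value;

    LoggingConfig(String value) {
        this.value = value;
        LOG.add("config values: " + value); // logged on every construction
    }
}

// Hypothetical stand-in for a client that needs a config.
final class DemoProducer {
    final LoggingConfig config;

    private DemoProducer(LoggingConfig config) {
        this.config = config;
    }

    // Problematic shape: the config is constructed twice, so it is logged twice.
    static DemoProducer createTwice(String value) {
        new LoggingConfig(value); // first construction, result discarded
        return new DemoProducer(new LoggingConfig(value));
    }

    // One possible fix: construct the config once and pass it along.
    static DemoProducer createOnce(String value) {
        return new DemoProducer(new LoggingConfig(value));
    }
}
```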
Author: huxi <huxi@zhenrongbao.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #2307 from amethystic/kafka-4434_Kafkaproducer_log_twice
Jason recently cleaned things up significantly by consolidating the Message/Record classes
into the common Java code in the clients module. While reviewing that, I noticed a few things
that could be improved a little more. To make reviewing easier, there will be multiple PRs.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Jason Gustafson <jason@confluent.io>
Closes #2271 from ijuma/records-minor-fixes
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Jason Gustafson <jason@confluent.io>
Closes #2155 from lindong28/KAFKA-4429
The original Javadoc description for `ConsumerRecord` is slightly confusing in that it can be read in a way such that an object is a key value pair received from Kafka, but (only) consists of the metadata associated with the record. This PR makes it clearer that the metadata is _included_ with the record, and moves the comma so that the phrase "topic name and partition number" in the sentence is more closely associated with the phrase "from which the record is being received".
Author: LoneRifle <LoneRifle@users.noreply.github.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2290 from LoneRifle/patch-1
Author: MURAKAMI Masahiko <fossamagna2@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2265 from fossamagna/fix-lz4outputstream-close
This was changed in b58b6a1bef and caused the `ReplicaVerificationToolTest.test_replica_lags`
system test to start failing.
I also added a unit test and a couple of other minor clean-ups.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #2280 from ijuma/kafka-4554-fix-replica-buffer-verify-checksum
If a file record was truncated during a write, improper type usage
(`AtomicInteger` in place of `int`) caused `IllegalFormatConversionException`
to be thrown instead of `KafkaException`.
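The failure mode can be reproduced outside Kafka. The sketch below uses a hypothetical message and class name; it relies only on the documented behaviour of `java.util.Formatter`, where `%d` accepts `Byte`, `Short`, `Integer`, `Long` and `BigInteger` but not `AtomicInteger`.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of formatting a counter into an error message.
final class FormatArgDemo {
    // Passing the AtomicInteger itself to %d throws IllegalFormatConversionException,
    // which would mask the exception the caller intended to raise.
    static String broken(AtomicInteger written) {
        return String.format(Locale.ROOT, "wrote only %d bytes", written);
    }

    // Unwrapping the value with get() yields an int, which %d accepts.
    static String fixed(AtomicInteger written) {
        return String.format(Locale.ROOT, "wrote only %d bytes", written.get());
    }
}
```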
Author: Kamil Szymanski <kamil.szymanski.dev@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #2275 from kamilszymanski/file_record_truncation_during_write
Author: Ashish Singh <asingh@cloudera.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Colin P. Mccabe <cmccabe@confluent.io>, Dana Powers <dana.powers@gmail.com>, Gwen Shapira <cshapi@gmail.com>, Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1251 from SinghAsDev/KAFKA-3600
The latter return `Iterable` instead of `Iterator` so that enhanced foreach can be used
in Java.
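A short illustration of why the return type matters. `EntriesDemo` and its methods are hypothetical and only mirror the shape of the change: Java's enhanced `for` works with an `Iterable` but not with a bare `Iterator`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical container exposing its entries in two ways.
final class EntriesDemo {
    private final List<String> entries = Arrays.asList("a", "b", "c");

    // Iterator-returning shape: callers must loop with hasNext()/next().
    Iterator<String> entriesIterator() {
        return entries.iterator();
    }

    // Iterable-returning shape: callers can use the enhanced for loop.
    Iterable<String> entries() {
        return entries;
    }

    static String joinAll(EntriesDemo demo) {
        StringBuilder sb = new StringBuilder();
        for (String entry : demo.entries()) { // would not compile with entriesIterator()
            sb.append(entry);
        }
        return sb.toString();
    }
}
```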
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #2261 from ijuma/deepEntries-shallowEntries
Tasks that don't have any `StateStore`s won't have a `StandbyTask`, so `createStandbyTask` can return `null`. We need to check for this in `StandbyTaskCreator.createTask(...)`.
Also, the checkpointed offsets for `StandbyTask`s are never loaded.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska, Matthias J. Sax, Guozhang Wang
Closes #2255 from dguy/kafka-4539
Fix OffsetIndex overflow when replicating a highly compacted topic.
https://issues.apache.org/jira/browse/KAFKA-4451
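As a rough illustration of this kind of overflow, assume an index entry stores each offset relative to the segment's base offset in a 4-byte integer. In a heavily compacted topic the gap between the base offset and a later offset can exceed `Integer.MAX_VALUE`. The class and method names below are hypothetical and do not reflect the actual fix.

```java
// Hypothetical sketch of relative-offset narrowing in an index entry.
final class RelativeOffsetDemo {
    // Unchecked narrowing: silently wraps around when the gap exceeds Integer.MAX_VALUE.
    static int uncheckedRelativeOffset(long baseOffset, long offset) {
        return (int) (offset - baseOffset);
    }

    // A guard a writer could apply before appending an entry to the index.
    static boolean fitsInIndex(long baseOffset, long offset) {
        long delta = offset - baseOffset;
        return delta >= 0 && delta <= Integer.MAX_VALUE;
    }
}
```

When `fitsInIndex` returns false, one option is to start a new segment with a larger base offset, although the sketch makes no claim about how the broker handles it.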
Author: Michael Schiff <schiff.michael@gmail.com>
Author: Michael Schiff <michael.schiff@tubemogul.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes #2210 from michaelschiff/bug/4451
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>, Jun Rao <junrao@gmail.com>
Closes #2140 from hachikuji/KAFKA4390
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes #2193 from enothereska/KAFKA-4405-prefetch
Improve consumer metric collection by collecting and recording metrics per topic.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #1684 from vahidhashemian/KAFKA-4000
Collecting socket server metrics during shutdown may throw NullPointerException
Author: Xavier Léauté <xavier@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #2221 from xvrl/fix-metrics-npe-on-shutdown
Also:
* Make all implementations of `Time` thread-safe as they are accessed from multiple threads in some cases.
* Change default implementation of `MockTime` to use two separate variables for `nanoTime` and `currentTimeMillis` as they have different `origins`.
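The two points above could look roughly like the following. This is a hypothetical sketch, not Kafka's `MockTime`; the class name, constructor and method names are assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical mock clock: wall-clock millis and monotonic nanos have independent origins.
final class SketchMockTime {
    private final AtomicLong millis; // stands in for System.currentTimeMillis()
    private final AtomicLong nanos;  // stands in for System.nanoTime()

    SketchMockTime(long startMillis, long startNanos) {
        this.millis = new AtomicLong(startMillis);
        this.nanos = new AtomicLong(startNanos);
    }

    long milliseconds() {
        return millis.get();
    }

    long nanoseconds() {
        return nanos.get();
    }

    // Advances both clocks by the same duration without actually sleeping.
    // Each update is atomic on its own; the pair is not updated as one unit.
    void sleep(long ms) {
        millis.addAndGet(ms);
        nanos.addAndGet(TimeUnit.MILLISECONDS.toNanos(ms));
    }
}
```

Keeping the two counters separate means a test cannot accidentally rely on `nanoseconds()` being a multiple of `milliseconds()`, which does not hold for the real clocks either.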
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Shikhar Bhushan <shikhar@confluent.io>, Jason Gustafson <jason@confluent.io>, Eno Thereska <eno.thereska@gmail.com>, Damian Guy <damian.guy@gmail.com>
Closes #2095 from ijuma/kafka-2247-consolidate-time-interfaces
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes #2190 from hachikuji/KAFKA-4469
Author: Antony Stubbs <antony.stubbs@gmail.com>
Reviewers: Eno Thereska <eno.thereska@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2157 from astubbs/trunk
Process requests received from channels before they were closed. For consumers, wait for coordinator requests to complete before returning from close.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1836 from rajinisivaram/KAFKA-3703
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Jiangjie Qin <becket.qin@gmail.com>, Jason Gustafson <jason@confluent.io>
Closes #1827 from mimaison/KAFKA-4081
[KAFKA-4284](https://issues.apache.org/jira/browse/KAFKA-4284)
Even though Partitioner has a close method, it is not closed when the producer is closed. Serializers, interceptors and metrics are all closed, so partitioners should be closed too.
To be able to use the same mechanism to close the partitioner as the serializers, etc. I had to make the `Partitioner` interface extend `Closeable`. Since this doesn't change the interface that feels ok and should be backwards compatible.
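A minimal sketch of why this is source compatible, using hypothetical names (`SketchPartitioner`, `SketchCloser`) instead of the real Kafka types. An interface that already declares `void close()` can extend `Closeable`, because an override may drop the checked `IOException`.

```java
import java.io.Closeable;

// Hypothetical partitioner interface that already had a close() method.
interface SketchPartitioner extends Closeable {
    int partition(String key, int numPartitions);

    // Overrides Closeable.close() without the checked IOException,
    // so existing implementations keep compiling unchanged.
    @Override
    void close();
}

// Hypothetical helper that closes any Closeable and reports, rather than throws, failures.
final class SketchCloser {
    /** Returns the failure raised by close(), or null if closing succeeded or c was null. */
    static Throwable closeQuietly(Closeable c) {
        if (c == null) {
            return null;
        }
        try {
            c.close();
            return null;
        } catch (Throwable t) {
            return t;
        }
    }
}
```

With the shared supertype, one helper can close serializers, interceptors and the partitioner in the same way.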
Looking at [KAFKA-2091](https://issues.apache.org/jira/browse/KAFKA-2091) (d6c45c70fb) that introduced the `Partitioner` interface it looks like the intention was that the producer should close the partitioner.
This contribution is my original work and I license the work to the project under the project's open source license.
Author: Theo <theo@iconara.net>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2000 from iconara/kafka-4284
I think the Javadoc should describe what happens if wakeup is called and no other thread is currently blocking. This may be important in some cases, e.g. trying to shut down a poll thread, followed by manually committing offsets.
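To make the scenario concrete, here is a toy model of the behaviour in question. It assumes that a wakeup issued while no thread is blocked is remembered and raised by the next blocking call. `WakeupSketch` and `WakeupSignal` are hypothetical names, not the consumer's API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model of a client whose blocking calls can be interrupted by wakeup().
final class WakeupSketch {
    // Hypothetical exception standing in for the one a blocking call would raise.
    static final class WakeupSignal extends RuntimeException {
    }

    private final AtomicBoolean wakeupRequested = new AtomicBoolean(false);

    // Safe to call from another thread; only records the request.
    void wakeup() {
        wakeupRequested.set(true);
    }

    // Stands in for any blocking call, for example a poll or a synchronous commit.
    String blockingCall() {
        // A pending wakeup is consumed exactly once by the next blocking call.
        if (wakeupRequested.getAndSet(false)) {
            throw new WakeupSignal();
        }
        return "ok";
    }
}
```

Under this assumption, a shutdown sequence that calls `wakeup()` after the poll thread has already returned would see the exception on its following manual commit, which is the case the Javadoc should spell out.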
Author: Stig Rohde Døssing <sdo@it-minds.dk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #2093 from srdo/minor-expand-wakeup-javadoc
Fixes a bug that inappropriately applies backoff as interval between metadata updates even though the current one is outdated.
Author: Yuto Kawamura <kawamuray.dadada@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes #1707 from kawamuray/KAFKA-4024-metadata-backoff
1. Create a new `ClientMetadata` to collapse `Set<String> consumerMemberIds`, `ClientState<TaskId> state`, and `HostInfo hostInfo`.
2. Stop reusing `stateChangelogTopicToTaskIds` and `internalSourceTopicToTaskIds` to access the (sub-)topology's internal repartition and changelog topics, for clarity; also use the source topics' num.partitions to set the num.partitions for repartition topics, and clarify that there must NOT be cycles, since otherwise the while loop will fail.
3. `ensure-copartition` at the end to modify the number of partitions for repartition topics if necessary to be equal to other co-partition topics.
4. Refactor `ClientState` as well and update the logic of `TaskAssignor` for clarity as well.
5. Change default `clientId` from `applicationId-suffix` to `applicationId-processId`, where `processId` is a UUID, to avoid conflicts between clientIds from different JVMs, and hence conflicts in metrics.
6. Enforce `assignment` partitions to have the same size, and hence 1-1 mapping to `activeTask` taskIds.
7. Remove the `AssignmentSupplier` class by always constructing the `partitionsByHostState` before assigning tasks to consumers within a client.
8. Remove all unnecessary member variables in `StreamPartitionAssignor`.
9. Some other minor fixes on unit tests, e.g. remove `test only` functions with java field reflection.
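Item 5 above can be pictured with a tiny sketch. The helper name `defaultClientId` is hypothetical; only the `applicationId-processId` shape comes from the description.

```java
import java.util.UUID;

// Hypothetical helper showing the applicationId-processId naming scheme.
final class ClientIdDemo {
    // processId is generated once per JVM instance, for example with UUID.randomUUID(),
    // so two instances of the same application get distinct client ids.
    static String defaultClientId(String applicationId, UUID processId) {
        return applicationId + "-" + processId;
    }
}
```

Because the client id typically appears in metric names, distinct ids per JVM keep the metrics of different instances from colliding.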
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Xavier Léauté, Matthias J. Sax, Eno Thereska, Jason Gustafson
Closes #2012 from guozhangwang/K4117-stream-partitionassignro-cleanup
There should be only one case where these clean-ups have a functional impact: replaced repeated identical logs with a single log for the stale controller epoch case.
The rest should just make the code easier to read and make it a bit less wasteful. I did this exercise because unused variables sometimes mask bugs.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #1985 from ijuma/remove-unused
Increase timeout in test to avoid transient failures due to long GC or slow machine.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes #2057 from rajinisivaram/KAFKA-2089
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2031 from hachikuji/KAFKA-4303
Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes #1995 from kkonstantine/KAFKA-4254-Update-producers-metadata-before-failing-on-non-existent-partition