90bbeedf52 introduced a regression resulting in passing one action per resource occurrence
to the `Authorizer` instead of one per unique resource name. Refactor
the signatures of both `filterAuthorized` and `authorize` to make them easier to test
and add a test for each.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Some tasks get closed inside `handleAssignment` but are not removed from the task manager's bookkeeping list. The next time around they would be closed again, which is an illegal state.
Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
The upper limit offset is displayed incorrectly in the log cleaner summary message. For example:
```
Log cleaner thread 0 cleaned log __consumer_offsets-47 (dirty section = [358800359, 358800359])
```
We should be using the next dirty offset as the upper limit.
Reviewers: David Arthur <mumrah@gmail.com>
On a metadata change for assigned topics, we trigger a rebalance, revoke partitions, and send a JoinGroup. If the metadata reverts to its original value and the JoinGroup fails, we don't resend the JoinGroup because we don't set `rejoinNeeded`. This PR sets `rejoinNeeded=true` when a rebalance is triggered due to a metadata change, to ensure that we retry on failure.
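A simplified sketch of the intended behavior (illustrative only; this is not the actual ConsumerCoordinator code, and all names here are made up):
```
// Illustrative: the rejoin flag must be set when metadata changes and only
// cleared after a successful JoinGroup, so a failed JoinGroup is retried even
// if the metadata has since reverted to its original value.
public final class RejoinTracker {
    private boolean rejoinNeeded = false;
    private int lastMetadataHash;

    public RejoinTracker(int initialMetadataHash) {
        this.lastMetadataHash = initialMetadataHash;
    }

    public void onMetadataUpdate(int metadataHash) {
        if (metadataHash != lastMetadataHash) {
            lastMetadataHash = metadataHash;
            rejoinNeeded = true;     // sticky until a JoinGroup succeeds
        }
    }

    public void onJoinGroupSuccess() {
        rejoinNeeded = false;
    }

    public boolean rejoinNeeded() {
        return rejoinNeeded;
    }
}
```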
Reviewers: Boyang Chen <boyang@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
Instance-level:
* number of alive stream threads
Thread-level:
* avg / max number of records polled from the consumer per runOnce, INFO
* avg / max number of records processed by the task manager (i.e. across all tasks) per runOnce, INFO
Task-level:
* number of current buffered records at the moment (i.e. it is just a dynamic gauge), DEBUG.
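All of these can be inspected programmatically via `KafkaStreams#metrics()`; a minimal sketch (the metric names and groups printed depend on the version and are not spelled out here):
```
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;

import java.util.Map;

public class StreamsMetricsDump {
    // Prints every registered metric, including the new instance-, thread- and
    // task-level metrics, as "group / name = value".
    public static void dump(KafkaStreams streams) {
        Map<MetricName, ? extends Metric> metrics = streams.metrics();
        metrics.forEach((name, metric) ->
                System.out.printf("%s / %s = %s%n", name.group(), name.name(), metric.metricValue()));
    }
}
```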
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <john@confluent.io>
As the title suggests, we would like to broaden this check so that we don't fail to close a doomed-to-cleanup task.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
As the title suggests, consumers first perform an OffsetFetch before starting normal processing. It makes sense to add this to the concurrent test suite to verify that it does not exhibit blocking behavior.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
https://issues.apache.org/jira/browse/KAFKA-8889 attempted to fill in the missing stacktrace in the log message when handling errors in FetchSessionHandler#handleError,
but the fix is not effective without KAFKA-7016.
The current fix removes the redundant pair of braces {} at the end of the log message. If and when the Throwable that is passed as argument to this method has a stacktrace, the log message will include it. Currently it doesn't because the Throwable argument does not have a stacktrace.
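For context, this relies on the standard SLF4J convention: a trailing Throwable argument only gets its stack trace rendered when it is not consumed by a `{}` placeholder. A minimal illustration (hypothetical logger usage, not the FetchSessionHandler code):
```
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Slf4jThrowableExample {
    private static final Logger log = LoggerFactory.getLogger(Slf4jThrowableExample.class);

    public static void main(String[] args) {
        Throwable t = new RuntimeException("boom");

        // The {} placeholder consumes the Throwable, so only its toString() is logged.
        log.error("Fetch session error: {}", t);

        // No placeholder left over: SLF4J treats the Throwable as the cause and
        // prints the full stack trace (when the Throwable actually has one).
        log.error("Fetch session error", t);
    }
}
```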
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
If the high-watermark is updated in the middle of a read with the `read_committed` isolation level, it is possible to return data above the LSO. In the worst case, this can lead to the read of an aborted transaction. The root cause is that the logic depends on reading the high-watermark twice. We fix the problem by reading it once and caching the value.
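The fix follows the usual pattern for racy reads of a concurrently updated value: take a single snapshot into a local variable and use it for every subsequent bound check. A simplified sketch (illustrative only, not the actual log-layer code):
```
// Illustrative: snapshot the high watermark once so the permitted upper bound
// cannot move between the bounds check and the actual read.
class LogReader {
    private volatile long highWatermark;

    byte[] read(long startOffset) {
        long upperBound = highWatermark;          // read once, cache in a local
        // Every subsequent check uses upperBound, never this.highWatermark again,
        // so a concurrent update cannot let the read go past the validated bound.
        return readUpTo(startOffset, upperBound);
    }

    private byte[] readUpTo(long startOffset, long upperBoundExclusive) {
        return new byte[0];                        // bounded read elided
    }
}
```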
Reviewers: David Arthur <mumrah@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Document the supported endpoint at the top-level (root) REST API resource and the information that it returns when a request is made to a Connect worker.
Fixes an omission in the documentation after KAFKA-2369 and KAFKA-6311 (KIP-238).
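For illustration, the root resource can be queried like any other Connect REST endpoint; a minimal client sketch (the worker address is an assumption, and the exact fields in the response depend on the Connect version):
```
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectRootResourceExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical worker address; adjust to your deployment.
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8083/")).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The body is expected to include the worker version, commit, and
        // (per KIP-238) the Kafka cluster id.
        System.out.println(response.body());
    }
}
```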
Reviewers: Toby Drake <tobydrake7@gmail.com>, Soenke Liebau <soenke.liebau@opencore.com>
First set of cleanup pushed to followup PR after KIP-441 Pt. 5. Main changes are:
1. Moved `RankedClient` and the static `buildClientRankingsByTask` to a new file
2. Moved `Movement` and the static `getMovements` to a new file (also renamed to `TaskMovement`)
3. Consolidated the many common variables throughout the assignment tests to the new `AssignmentTestUtils`
4. New utility to generate comparable/predictable UUIDs for tests, and removed the generic from `TaskAssignor` and all related classes
Reviewers: John Roesler <vvcephei@apache.org>, Andrew Choi <a24choi@edu.uwaterloo.ca>
There is a race on receiving a LeaderAndIsr request for a replica with an active log dir reassignment. If the reassignment completes just before the LeaderAndIsr handler updates epoch information, it can lead to an illegal state error since no future log dir exists. This patch fixes the problem by ensuring that the future log dir exists when the fetcher is started. Removal cannot happen concurrently because it requires acquiring the same partition state lock.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>
Co-authored-by: Chia-Ping Tsai <chia7712@gmail.com>
The runtime type of `Metric.metricValue()` needn't always be a `Double`;
for example, it can be an integer gauge from `IntGaugeSuite`.
Since non-double values cannot be formatted with three-point precision,
an `IllegalFormatConversionException` resulted.
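A minimal reproduction of the underlying Java behavior, together with one possible guard (illustrative only, not the tool's actual formatting code):
```
import java.util.IllegalFormatConversionException;

public class MetricFormatExample {
    public static void main(String[] args) {
        Object doubleValue = 42.0;
        Object gaugeValue = 42;          // e.g. an Integer coming from a gauge

        System.out.println(String.format("%.3f", doubleValue));        // 42.000

        try {
            System.out.println(String.format("%.3f", gaugeValue));     // %f only accepts floating-point types
        } catch (IllegalFormatConversionException e) {
            System.out.println("caught: " + e);                        // f != java.lang.Integer
        }

        // One possible guard: only apply the precision format to actual doubles.
        String formatted = gaugeValue instanceof Double
                ? String.format("%.3f", gaugeValue)
                : String.valueOf(gaugeValue);
        System.out.println(formatted);
    }
}
```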
Author: Tom Bentley <tbentley@redhat.com>
Author: Tom Bentley <tombentley@users.noreply.github.com>
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #8373 from tombentley/KAFKA-9775-IllegalFormatConversionException
For some context, when building a streams application, the optimizer keeps track of the key-changing operations and any repartition nodes that are descendants of the key-changer. During the optimization phase (if enabled), any repartition nodes are logically collapsed into one. The optimizer updates the graph by inserting the single repartition node between the key-changing node and its first child node. This graph update process is done by searching for a node that has the key-changing node as one of its direct parents, and the search starts from the repartition node, going up in the parent hierarchy.
The one exception to this rule is if there is a merge node that is a descendant of the key-changing node, then during the optimization phase, the map tracking key-changers to repartition nodes is updated to have the merge node as the key. Then the optimization process updates the graph to place the single repartition node between the merge node and its first child node.
The error in KAFKA-9739 occurred because there was an assumption that the repartition nodes are children of the merge node. But in the topology from KAFKA-9739, the repartition node was a parent of the merge node. So when attempting to find the first child of the merge node, nothing was found (obviously), resulting in a `StreamsException` ("Found a null keyChangingChild node for ...").
This PR fixes this bug by first checking that all repartition nodes for optimization are children of the merge node.
This PR includes a test with the topology from KAFKA-9739.
Reviewers: John Roesler <john@confluent.io>
Revert the decision to have the sendOffsetsToTransaction(groupMetadata) API fail against old broker versions, for the sake of making applications easier to adapt between versions. This PR silently downgrades the TxnOffsetCommit API when the built request version is smaller than 3.
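For context, this is the group-metadata overload in question; a minimal usage sketch (client setup, topics, and the offsets map are elided and purely illustrative):
```
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.TopicPartition;

import java.util.Map;

public class TxnOffsetCommitExample {
    // Commit consumed offsets within the producer's transaction using the
    // group-metadata overload; with this change the call no longer fails outright
    // against brokers that only support older TxnOffsetCommit versions.
    static void commitWithinTransaction(KafkaProducer<String, String> producer,
                                        KafkaConsumer<String, String> consumer,
                                        Map<TopicPartition, OffsetAndMetadata> offsets) {
        producer.beginTransaction();
        // ... produce output records here ...
        producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
        producer.commitTransaction();
    }
}
```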
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
As documented in the KIP:
We shall set the `transaction.timeout.ms` default to 10000 ms (10 seconds) in Kafka Streams.
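Applications that need a different timeout can still override the embedded producer's setting; a minimal sketch (the application id, bootstrap servers, and chosen timeout are illustrative):
```
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

import java.util.Properties;

public class StreamsTxnTimeoutConfigExample {
    public static Properties buildConfig() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");      // illustrative
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // illustrative
        // Override the Streams default of 10000 ms for the embedded producer.
        props.put(StreamsConfig.producerPrefix(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG), 60_000);
        return props;
    }
}
```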
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Adds a new TaskAssignor implementation, currently hidden behind an internal feature flag, that implements the high availability algorithm of KIP-441.
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
Rename the test suite to later add unit tests that don't depend on
ZK or the AdminClient TopicCommand types.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Co-authored-by: Andrew Olson <aolson1@cerner.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
Once Scala 2.13.2 is officially released, I will submit a follow-up PR
that enables `-Xfatal-warnings` with the necessary warning
exclusions. Compiler warning exclusions were only introduced in 2.13.2,
which is why we have to wait for it. I used a snapshot build to
test it in the meantime.
Changes:
* Remove Deprecated annotation from internal request classes
* Class.newInstance is deprecated in favor of
Class.getConstructor().newInstance
* Replace deprecated JavaConversions with CollectionConverters
* Remove unused kafka.cluster.Cluster
* Don't use Map and Set methods deprecated in 2.13:
- collection.Map +, ++, -, --, mapValues, filterKeys, retain
- collection.Set +, ++, -, --
* Add scala-collection-compat dependency to streams-scala and
update version to 2.1.4.
* Replace usages of deprecated Either.get and Either.right
* Replace usage of deprecated Integer(String) constructor
* `import scala.language.implicitConversions` is not needed in Scala 2.13
* Replace usage of deprecated `toIterator`, `Traversable`, `seq`,
`reverseMap`, `hasDefiniteSize`
* Replace usage of deprecated alterConfigs with incrementalAlterConfigs
where possible
* Fix implicit widening conversions from Long/Int to Double/Float
* Avoid implicit conversions to String
* Eliminate usage of deprecated procedure syntax
* Remove `println` in `LogValidatorTest` instead of fixing the compiler
warning since tests should not `println`.
* Eliminate implicit conversion from Array to Seq
* Remove unnecessary usage of 3 argument assertEquals
* Replace `toStream` with `iterator`
* Do not use deprecated SaslConfigs.DEFAULT_SASL_ENABLED_MECHANISMS
* Replace StringBuilder.newBuilder with new StringBuilder
* Rename AclBuffers to AclSeqs and remove usage of `filterKeys`
* More consistent usage of Set/Map in Controller classes: this also fixes
deprecation warnings with Scala 2.13
* Add spotBugs exclusion for inliner artifact in KafkaApis with Scala 2.12.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
We find that brokers may unexpectedly send an empty assignment for some members, and we need more logs to investigate this issue.
Reviewers: John Roesler <vvcephei@apache.org>
Measure the ratio of time the stream thread spends processing each task among all assigned active tasks (KIP-444). Also add unit tests to cover the metrics added in this PR and the previous #8358, and attempt to fix the flaky test reported in KAFKA-5842.
Co-authored-by: John Roesler <vvcephei@apache.org>
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
This patch addresses a locking issue with DelayedTxnMarker completion. Because of the reliance on the shared read lock in TransactionStateManager and the deadlock avoidance algorithm in `DelayedOperation`, we cannot guarantee that a call to checkAndComplete will offer an opportunity to complete the delayed operation. This patch removes the reliance on this lock in two ways:
1. We replace the transaction marker purgatory with a map of transactions with pending markers (see the sketch after this list). We were not using purgatory expiration anyway, so this avoids the locking issue and simplifies usage.
2. We were also relying on the read lock for the `DelayedProduce` completion when calling `ReplicaManager.appendRecords`. As far as I can tell, this was not necessary. The lock order is always 1) state read/write lock, 2) txn metadata locks. Since we only call `appendRecords` while holding the read lock, a deadlock does not seem possible.
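A loose sketch of the first point (illustrative only, not the broker's actual data structures): pending markers can be tracked in a concurrent map keyed by transactional id, so completing them needs no purgatory locking at all.
```
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative: a transaction's entry is completed once every destination
// partition has acknowledged its marker, with no purgatory involved.
class PendingMarkerTracker<P> {
    private final ConcurrentMap<String, Set<P>> pendingByTxnId = new ConcurrentHashMap<>();

    void addPending(String transactionalId, Set<P> partitions) {
        Set<P> pending = ConcurrentHashMap.newKeySet();
        pending.addAll(partitions);
        pendingByTxnId.put(transactionalId, pending);
    }

    // Returns true when the last marker for this transaction has been acknowledged.
    boolean markerAcknowledged(String transactionalId, P partition) {
        Set<P> remaining = pendingByTxnId.get(transactionalId);
        if (remaining == null) {
            return false;
        }
        remaining.remove(partition);
        return remaining.isEmpty() && pendingByTxnId.remove(transactionalId, remaining);
    }
}
```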
Reviewers: Jun Rao <junrao@gmail.com>
* Fixed the DataException thrown when handling tombstone events with a null value
* Passes through the original record when it finds a null key while configured for keys, or a null value while configured for values.
* Added unit tests for schema and schemaless data
The `unitTest` and `integrationTest` tasks used to run tests don't exclude
`**/*Suite`, so the tests included by a Suite class are executed twice.
For example:
```
11:42:25 org.apache.kafka.streams.integration.StoreQuerySuite > org.apache.kafka.streams.integration.QueryableStateIntegrationTest.shouldBeAbleToQueryMapValuesState STARTED
11:42:26
11:42:26 org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > shouldKStreamGlobalKTableJoin PASSED
11:42:30
11:42:30 org.apache.kafka.streams.integration.StoreQuerySuite > org.apache.kafka.streams.integration.QueryableStateIntegrationTest.shouldBeAbleToQueryMapValuesState PASSED
...
11:48:42 org.apache.kafka.streams.integration.QueryableStateIntegrationTest > shouldBeAbleToQueryMapValuesState STARTED
11:48:46
11:48:46 org.apache.kafka.streams.integration.QueryableStateIntegrationTest > shouldBeAbleToQueryMapValuesState PASSED
```
For consistency, move the existing exclusion for the `test` task from the
`streams` project to `subprojects`.
Reviewers: Ismael Juma <ismael@juma.me.uk>
In case of an error while flattening a record with a schema, the Flatten transformation was reporting an error about a record without a schema, as follows:
```
org.apache.kafka.connect.errors.DataException: Flatten transformation does not support ARRAY for record without schemas (for field ...)
```
The expected behaviour would be an error message specifying "with schemas".
This looks like a simple copy/paste typo carried over from the equivalent schemaless methods in the same file.
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Konstantine Karantasis <konstantine@confluent.io>
Simple doc fix in a code snippet in connect.html
Co-authored-by: Scott Ferguson <smferguson@gmail.com>
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Konstantine Karantasis <konstantine@confluent.io>
Add missing @Override annotations and use lambdas when declaring threads to suppress warnings in IDEs and improve readability.
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>
This PR enhances the epoch checking logic for the endTransaction call in TransactionCoordinator. Previously it relaxed the check by allowing a producer epoch bump, which is error-prone since there is no reason to see a producer epoch bump from the client.
Reviewers: Jason Gustafson <jason@confluent.io>
When a caching state store is closed, it calls its flush() method.
If flush() throws an exception, the underlying state store is not closed.
This commit ensures that the state stores underlying a wrapped state store
are closed even when preceding operations in the close method throw.
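A minimal sketch of the pattern (illustrative, not the actual caching store code): make sure the wrapped store's close() runs even when flush() throws, and surface the first failure.
```
import org.apache.kafka.streams.processor.StateStore;

// Illustrative wrapper: close the inner store even if flush() throws, and
// rethrow the first exception so the failure is not swallowed.
class SafeClosingStore {
    private final StateStore inner;

    SafeClosingStore(StateStore inner) {
        this.inner = inner;
    }

    void close() {
        RuntimeException firstException = null;
        try {
            inner.flush();               // may throw, e.g. while evicting dirty cache entries
        } catch (RuntimeException e) {
            firstException = e;
        }
        try {
            inner.close();               // must run regardless of the flush outcome
        } catch (RuntimeException e) {
            if (firstException == null) {
                firstException = e;
            }
        }
        if (firstException != null) {
            throw firstException;
        }
    }
}
```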
Co-authored-by: John Roesler <vvcephei@apache.org>
Reviewers: John Roesler <vvcephei@apache.org>, Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
For reasons outlined in https://issues.apache.org/jira/browse/KAFKA-9771
we can't upgrade to a version of Jetty with the bug fixed, or downgrade to one prior to the introduction of the bug. Luckily, the actual fix is pretty straightforward and can be ported over to Connect for use until it's possible to upgrade to a version of Jetty with that bug fixed: https://github.com/eclipse/jetty.project/pull/4404/files#diff-58640db0f8f2cd84b7e653d1c1540913R2188-R2193
The changes here have been verified locally; a test with multiple certificates/multiple hostnames will be submitted in a follow-up.
Reviewers: Jeff Huang <47870461+jeffhuang26@users.noreply.github.com>, Konstantine Karantasis <konstantine@confluent.io>