Newer clients were getting stuck entering the validation phase even when a broker didn't support it. This commit bypasses the AWAITING_VALIDATION state when the broker is on an older version of the OffsetsForLeaderEpoch RPC.
Also fixes a system test by configuring the HighAvailabilityTaskAssignor (HATA) to perform a one-shot balanced assignment.
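A minimal sketch of the version-gating idea, with all names illustrative rather than taken from the actual fetcher code:
```java
// Sketch only: decide whether a partition should enter the validation phase based on
// the leader's OffsetsForLeaderEpoch version. The enum, constant, and method names are
// illustrative and not the real fetcher implementation.
final class FetchStateTransition {
    enum FetchState { AWAITING_VALIDATION, FETCHING }

    // Hypothetical minimum RPC version that supports leader-epoch validation.
    static final short MIN_VERSION_FOR_VALIDATION = 2;

    static FetchState nextState(short leaderOffsetsForLeaderEpochVersion) {
        return leaderOffsetsForLeaderEpochVersion >= MIN_VERSION_FOR_VALIDATION
                ? FetchState.AWAITING_VALIDATION  // broker can validate the fetch offset/epoch
                : FetchState.FETCHING;            // too old: skip straight to fetching
    }
}
```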
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Bruno Cadonna <bruno@confluent.io>
Implements KIP-401:
- Add the ConnectedStoreProvider interface
- Let Processor/[*]Transformer[*]Supplier interfaces extend ConnectedStoreProvider
- Allow state stores to be added to and connected with processors/transformers implicitly (see the sketch below)
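A minimal sketch of how a supplier might use the new interface (the store and class names are hypothetical): because `stores()` is overridden, the store is added to the topology and connected to the processor automatically, with no separate `addStateStore`/`connectProcessorAndStateStores` calls.
```java
import java.util.Collections;
import java.util.Set;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.ProcessorSupplier;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

// Hypothetical supplier: overriding stores() lets the "counts" store be added and
// connected implicitly when this supplier is attached to the topology.
public class CountingProcessorSupplier implements ProcessorSupplier<String, String> {

    @Override
    public Set<StoreBuilder<?>> stores() {
        final StoreBuilder<KeyValueStore<String, Long>> countsStore =
            Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("counts"),
                Serdes.String(),
                Serdes.Long());
        return Collections.<StoreBuilder<?>>singleton(countsStore);
    }

    @Override
    public Processor<String, String> get() {
        return new Processor<String, String>() {
            private KeyValueStore<String, Long> counts;

            @SuppressWarnings("unchecked")
            @Override
            public void init(final ProcessorContext context) {
                counts = (KeyValueStore<String, Long>) context.getStateStore("counts");
            }

            @Override
            public void process(final String key, final String value) {
                final Long previous = counts.get(key);
                counts.put(key, previous == null ? 1L : previous + 1L);
            }

            @Override
            public void close() { }
        };
    }
}
```
With this, `stream.process(new CountingProcessorSupplier())` is sufficient; the store no longer has to be added and connected explicitly.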
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>
KAFKA-8538 (#6957) already added `group.instance.id` to `MemberDescription` but didn't print it in the describe-group output, so this patch adds the logic to do so.
Before the change, the describe command prints as follows:
```
GROUP         CONSUMER-ID                                                   HOST        CLIENT-ID                #PARTITIONS
DemoConsumer  consumer-DemoConsumer-2-89251f12-f0ae-4dc1-a118-bda49f2a6e86  /127.0.0.1  consumer-DemoConsumer-2  0
DemoConsumer  consumer-DemoConsumer-1-72221c6b-f3d9-4c68-96db-ffffa12ddf93  /127.0.0.1  consumer-DemoConsumer-1  1
```
After the change, the describe command prints as follows:
```
GROUP         CONSUMER-ID                                      GROUP-INSTANCE-ID  HOST        CLIENT-ID                        #PARTITIONS
DemoConsumer  groupIns2-f050379c-9c0d-433c-bbe0-44de6177b60d   groupIns2          /127.0.0.1  consumer-DemoConsumer-groupIns2  0
DemoConsumer  groupIns1-44805ba9-ae6f-49d3-89af-44a4b95aff8d   groupIns1          /127.0.0.1  consumer-DemoConsumer-groupIns1  1
```
If all `GROUP-INSTANCE-ID` values are null, the output is the same as before:
```
GROUP         CONSUMER-ID                                                   HOST        CLIENT-ID                #PARTITIONS
DemoConsumer  consumer-DemoConsumer-2-89251f12-f0ae-4dc1-a118-bda49f2a6e86  /127.0.0.1  consumer-DemoConsumer-2  0
DemoConsumer  consumer-DemoConsumer-1-72221c6b-f3d9-4c68-96db-ffffa12ddf93  /127.0.0.1  consumer-DemoConsumer-1  1
```
Reviewers: Alice <WheresAlice@users.noreply.github.com>, Matthias J. Sax <matthias@confluent.io>, Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>
Kafka Connect workers have been able to create Connect's internal topics using the new admin client for some time now (see KAFKA-4667 for details). However, the tasks of source connectors still rely on the broker to auto-create topics with default config settings if they don't exist, or expect these topics to exist before the connector is deployed if their configuration needs to be specialized.
With the implementation of KIP-158 here, if `topic.creation.enable=true`, Kafka Connect will supply the source tasks of connectors that are configured to create topics with an admin client that will allow them to create new topics on-the-fly before writing the first source records to a new topic. Additionally, each source connector has the opportunity to customize the topic-specific settings of these new topics by defining groups of topic configurations.
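As a rough illustration (the property names follow the KIP; the connector, class, and topic names are made up), a source connector that is allowed to create topics might carry configuration like the following, shown here as a Java map:
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical KIP-158 style connector configuration: default creation rules plus a
// named group that overrides them for matching topics. Names and values are examples.
public class TopicCreationConfigExample {
    public static Map<String, String> sourceConnectorProps() {
        Map<String, String> props = new HashMap<>();
        props.put("name", "my-source");
        props.put("connector.class", "com.example.MySourceConnector");
        // Defaults applied to every topic the connector's tasks may create.
        props.put("topic.creation.default.replication.factor", "3");
        props.put("topic.creation.default.partitions", "5");
        // A named group that overrides the defaults for topics matching its include pattern.
        props.put("topic.creation.groups", "compacted");
        props.put("topic.creation.compacted.include", "state\\..*");
        props.put("topic.creation.compacted.cleanup.policy", "compact");
        return props;
    }
}
```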
This feature is tested here via unit tests (old tests that have been adjusted and new ones) as well as integration tests.
Reviewers: Randall Hauch <rhauch@gmail.com>
Sometimes logging leaves us guessing at the cause of an increment to the log start offset. Since this results in deletion of user data, we should provide the reason explicitly.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>
We iterate the versions list each time we look up lastVersion, including
in the hot path Log.appendAsFollower.
Given that allVersions is a constant, this is unnecessary.
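A minimal sketch of the idea in Java (the actual code is Scala and the names below are illustrative): compute the last element of the constant list once instead of scanning the list on every lookup.
```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch only: because the list of versions never changes, the last version can be
// computed once at class-load time rather than by iterating the list per call from
// the hot path. The version values are placeholders.
final class Versions {
    static final List<String> ALL_VERSIONS =
            Collections.unmodifiableList(Arrays.asList("0.10.0", "0.11.0", "1.0", "2.0"));

    // Cached once instead of recomputed on every lastVersion lookup.
    static final String LAST_VERSION = ALL_VERSIONS.get(ALL_VERSIONS.size() - 1);
}
```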
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Although the statuses for tasks are removed from the status store when their connector is deleted, they are not removed when only a task is deleted, which happens when the number of tasks for a connector is reduced.
This commit adds logic for deleting the statuses for those tasks from the status store whenever a rebalance has completed and the leader of a distributed cluster has detected that there are recently-deleted tasks. Standalone is also updated to accomplish this.
Unit tests for the `DistributedHerder` and `StandaloneHerder` classes are updated and an integration test has been added.
Reviewers: Nigel Liang <nigel@nigelliang.com>, Konstantine Karantasis <konstantine@confluent.io>
Added access to OffsetStorageReader from SourceConnector per KIP-131.
Added two interfaces, SourceConnectorContext and SinkConnectorContext, that extend ConnectorContext in order to expose an OffsetStorageReader instance.
Added unit tests for the Connector, SinkConnector, and SourceConnector default methods.
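A minimal sketch of how a connector can use the new context (the connector, partition key, and file name are hypothetical; only the `SourceConnectorContext`/`OffsetStorageReader` usage reflects the KIP):
```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceTask;
import org.apache.kafka.connect.storage.OffsetStorageReader;

// Hypothetical connector: start() consults previously committed offsets through the
// SourceConnectorContext exposed by context().
public class ExampleSourceConnector extends SourceConnector {

    @Override
    public void start(Map<String, String> props) {
        OffsetStorageReader reader = context().offsetStorageReader();
        Map<String, Object> lastOffset =
                reader.offset(Collections.singletonMap("file", "example.txt"));
        // lastOffset is null on the first run; otherwise it holds the last committed
        // offsets and can inform how the tasks are configured.
    }

    @Override
    public Class<? extends Task> taskClass() {
        return SourceTask.class; // placeholder for a concrete task class
    }

    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        return Collections.emptyList(); // placeholder
    }

    @Override
    public void stop() { }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public String version() {
        return "1.0";
    }
}
```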
Author: Florian Hussonnois <florian.hussonnois@gmail.com>, Randall Hauch <rhauch@gmail.com>
Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>
"console-producer" supports the setting of "client.id", which is a reasonable requirement, and the way "console consumer" and "console producer" handle "client.id" can be unified. "client.id" defaults to "console-producer"
Co-authored-by: xinzhuxiansheng <xinzhuxiansheng@autohome.com.cn>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Implemented KIP-437 by adding a new optional configuration property for the `MaskField` transformation that allows users to define a replacement literal for specific fields in matching records.
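For illustration (the `replacement` property name follows KIP-437; the transform alias and field name are made up), the transformation might be configured like this, shown as a Java map:
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical connector configuration fragment: mask the "ssn" field and replace it
// with a custom literal instead of the type's default empty/zero value.
public class MaskFieldConfigExample {
    public static Map<String, String> maskFieldProps() {
        Map<String, String> props = new HashMap<>();
        props.put("transforms", "mask");
        props.put("transforms.mask.type", "org.apache.kafka.connect.transforms.MaskField$Value");
        props.put("transforms.mask.fields", "ssn");
        props.put("transforms.mask.replacement", "****");
        return props;
    }
}
```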
Author: Valeria Vasylieva <valeria.vasylieva@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
Added support for customizing the HTTP response headers for Kafka Connect as described in KIP-577.
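For illustration (the property name follows KIP-577; the header rule is an example), a worker could be configured to add a response header like this, shown as a Java map:
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical worker configuration fragment: add an X-Frame-Options header to every
// Kafka Connect REST response.
public class ResponseHeaderConfigExample {
    public static Map<String, String> workerProps() {
        Map<String, String> props = new HashMap<>();
        props.put("response.http.headers.config", "add X-Frame-Options: DENY");
        return props;
    }
}
```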
Author: Jeff Huang <jeff.huang@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
The changes made in KIP-454 involved adding a `connectorConfig` method to the ConnectClusterState interface that REST extensions could use to query the worker for the configuration of a given connector. The implementation for this method returns the Java `Map` that's stored in the worker's view of the config topic (when running in distributed mode). No copying is performed, which causes mutations of that `Map` to persist across invocations of `connectorConfig` and, even worse, propagate to the worker when, e.g., starting a connector.
In this commit the map is copied before it's returned to REST extensions.
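A minimal sketch of the fix (the class and method names here are illustrative, not the actual worker code): copy the stored map before handing it to a REST extension so callers cannot mutate the worker's view of the config topic.
```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a defensive copy isolates REST extensions from the worker's internal state.
final class ConnectorConfigAccess {
    static Map<String, String> copyForExtension(Map<String, String> storedConfig) {
        return storedConfig == null ? null : new HashMap<>(storedConfig);
    }
}
```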
An existing unit test is modified to ensure that REST extensions receive a copy of the connector config, not the original.
Reviewers: Nigel Liang <nigel@nigelliang.com>, Konstantine Karantasis <konstantine@confluent.io>
Added support for -1 as the replication factor and number of partitions for the distributed worker's internal topics, expanding the allowed values from positive values only to also include -1, which signifies that the broker defaults should be used.
The Kafka storage classes were already constructing a `NewTopic` object (always with a replication factor and partitions) and sending it to Kafka when required. This change will avoid setting the replication factor and/or number of partitions on this `NewTopic` if the worker configuration uses -1 for the corresponding configuration value.
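A rough sketch of how -1 can map to "use the broker default" when building the `NewTopic` (the surrounding class and method names are illustrative):
```java
import java.util.Optional;

import org.apache.kafka.clients.admin.NewTopic;

// Sketch only: an empty Optional leaves the partition count or replication factor
// unset so the broker-side defaults apply.
final class InternalTopicSpec {
    static NewTopic newTopic(String name, int partitions, short replicationFactor) {
        Optional<Integer> numPartitions =
                partitions == -1 ? Optional.empty() : Optional.of(partitions);
        Optional<Short> replication =
                replicationFactor == -1 ? Optional.empty() : Optional.of(replicationFactor);
        return new NewTopic(name, numPartitions, replication);
    }
}
```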
Also added support for extra settings for internal topics on distributed config, status, and offset internal topics.
Quite a few new tests were added to verify that the `TopicAdmin` utility class is correctly using the AdminClient, and that the `DistributedConfig` validators for these configurations are correct. Also added integration tests for internal topic creation, covering preexisting functionality plus the new functionality.
Author: Randall Hauch <rhauch@gmail.com>
Reviewer: Konstantine Karantasis <konstantine@confluent.io>
If the request timeout is larger than the rebalance timeout, we should use the former as the JoinGroup request timeout. This patch also includes some minor improvements to request/response logging in `NetworkClient` including adding the request timeout to the log message.
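In effect (variable names are illustrative), the JoinGroup request timeout becomes the larger of the two values:
```java
// Sketch only: never give the JoinGroup request less time than the configured request timeout.
final class JoinGroupTimeout {
    static int joinGroupRequestTimeoutMs(int requestTimeoutMs, int rebalanceTimeoutMs) {
        return Math.max(requestTimeoutMs, rebalanceTimeoutMs);
    }
}
```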
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
We should treat standbys similarly to active stateful tasks and
re-assign them to instances that are already caught up on them
while we warm them up on the desired destination, instead of
immediately moving them to the destination.
Reviewers: Bruno Cadonna <bruno@confluent.io>
Prior to this fix a plugged-in verifiable client, such as
confluent-kafka-python, would be deployed on the node in the background
worker thread as the client was started. Since this could be time consuming
(e.g., 10+ seconds) and since the main test thread would continue to
operate, it was common for the current test to time out waiting
for e.g. the verifiable producer to produce messages while it was in fact
still deploying.
The fix here is to deploy the verifiable client on the node when
the verifiable client is instantiated, which is thus a blocking
operation on the main test thread, avoiding any test-based timeouts.
Reviewers: Jason Gustafson <jason@confluent.io>
At the time of this writing there are 6 schemas in the Kafka APIs with no fields: 3
versions each of LIST_GROUPS and API_VERSIONS.
When reading instances of these schemas off the wire there's little point in
returning a unique Struct object (or a unique values array inside that Struct)
since there is no payload.
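A minimal sketch of the idea using simplified stand-in types (not Kafka's actual protocol classes): when a schema has zero fields, every decoded instance is identical, so a single cached value can be reused instead of allocating per request.
```java
// Sketch with stand-in types: reuse one cached values array for zero-field schemas
// instead of allocating a fresh array for every request read off the wire.
final class EmptyPayloadCache {
    private static final Object[] NO_VALUES = new Object[0];

    static Object[] readValues(int numFields) {
        return numFields == 0 ? NO_VALUES : new Object[numFields];
    }
}
```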
Reviewers: Ismael Juma <ismael@juma.me.uk>
The `MirrorTaskConfig` class mutates the `ConfigDef` by defining additional properties, which leads to a potential `ConcurrentModificationException` during worker configuration validation and to the unintended inclusion of those new properties in the `ConfigDef` for the connectors, which in turn is visible via the REST API's `/connectors/{name}/config/validate` endpoint.
The fix here is a one-liner that just creates a copy of the `ConfigDef` before defining new properties.
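A minimal sketch of that fix (the property defined below is illustrative): `ConfigDef`'s copy constructor yields an independent definition, so task-only properties no longer leak into the connector-level `ConfigDef`.
```java
import org.apache.kafka.common.config.ConfigDef;

// Sketch only: copy the shared ConfigDef before defining additional, task-only properties.
final class TaskConfigDefs {
    static ConfigDef taskConfigDef(ConfigDef connectorConfigDef) {
        return new ConfigDef(connectorConfigDef)
                .define("task.assigned.partitions", ConfigDef.Type.STRING, null,
                        ConfigDef.Importance.LOW, "Partitions assigned to this task.");
    }
}
```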
Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>
Currently, if a connector is deleted, its task configurations will remain in the config snapshot tracked by the KafkaConfigBackingStore. This causes issues with incremental cooperative rebalancing, which utilizes that config snapshot to determine which connectors and tasks need to be assigned across the cluster. Specifically, it first checks to see which connectors are present in the config snapshot, and then, for each of those connectors, queries the snapshot for that connector's task configs.
The lifecycle of a connector is as follows: its configuration is written to the config topic; that write is picked up by the workers in the cluster and triggers a rebalance; the connector is assigned to and started by a worker; task configs are generated by the connector and then written to the config topic; that write is picked up by the workers and triggers a second rebalance; and finally, the tasks are assigned to and started by workers across the cluster.
There is a brief period in between the first time the connector is started and when the second rebalance has completed during which those stale task configs from a previously-deleted version of the connector will be used by the framework to start tasks for that connector. This fix aims to eliminate that window by preemptively clearing the task configs from the config snapshot for a connector whenever it has been deleted.
An existing unit test is modified to verify this behavior, and should provide sufficient guarantees that the bug has been fixed.
Reviewers: Nigel Liang <nigel@nigelliang.com>, Konstantine Karantasis <konstantine@confluent.io>
The class claims to be immutable, but it has some mutable features.
This change increases its immutability and adds a little cleanup:
* Pre-initialize size of ArrayList
* Remove superfluous syntax
* Use ArrayList instead of LinkedList since the list is created once
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>
Segmented state stores turn on bulk loading of the underlying RocksDB
when restoring. This is correct for segmented state stores that
are in restore mode on active tasks, and the onRestoreStart() and
onRestoreEnd() callbacks in RocksDBSegmentsBatchingRestoreCallback take care
of toggling bulk loading mode on and off. However, restoreAll()
in RocksDBSegmentsBatchingRestoreCallback might also turn on bulk loading
mode. When this happens on a standby task, bulk loading mode is never
turned off. That leads to steadily increasing open file descriptors
in RocksDB, because in bulk loading mode RocksDB continuously creates new
files but never compacts them (which is the intended behaviour of that mode).
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
* MINOR: Fix typo in RecordAccumulator
* MINOR: Fix typo in several files
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>
A standby task could also be at risk of getting into an illegal state when it is not closed during handleLostAll:
1. The standby task was initializing in CREATED state, and a task-corrupted exception was thrown from registerStateStores.
2. The task-corrupted exception was caught, and a commit of the non-affected tasks was performed.
3. The task commit failed due to a task-migrated exception.
4. handleLostAll didn't close the standby task, leaving it in CREATED state.
5. When the next rebalance completed, the same task was assigned back as a standby task.
6. An IllegalArgumentException was thrown because the state store was already registered.
Reviewers: A. Sophie Blee-Goldman <ableegoldman@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
As stated, we cannot wait for handleRebalanceComplete in the case of handleLostAll, since we have already closed the active task as dirty and could potentially require its offsets in the next thread.runOnce call.
Co-authored-by: Guozhang Wang <wangguoz@gmail.com>
Reviewers: A. Sophie Blee-Goldman <ableegoldman@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Fixes EmbeddedKafkaCluster.deleteTopicAndWait for use with kafka_2.13
Reviewers: Boyang Chen <boyang@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, John Roesler <vvcephei@apache.org>
Fix broken links
Rephrase a sentence
Update the version number
Reviewers: Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bbejeck@apache.org>
This is a prerequisite for KAFKA-9501 and will also be useful for KAFKA-9603.
There should be no logical changes here: the main difference is the removal of StandbyContextImpl in preparation for contexts to transition between active and standby.
Also includes some minor cleanup, e.g. pulling the ReadOnly/ReadWrite decorators out into a separate file.
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>, Guozhang Wang <wangguoz@gmail.com>