Only check if positions need validation if there is new metadata.
Also fix some inefficient java.util.stream code in the hot path of SubscriptionState.
From KIP-478, implement the new StreamBuilder#addGlobalStore() overload
that takes a stateUpdateSupplier fully typed Processor<KIn, VIn, Void, Void>.
Where necessary, use the adapters to make the old APIs defer to the new ones,
as well as limiting the scope of this change set.
Reviewers: Boyang Chen <boyang@apache.org>
* KAFKA-10407: Have KafkaLog4jAppender `batch.size` and `linger.ms`
https://issues.apache.org/jira/browse/KAFKA-10407
Currently, KafkaLog4jAppender does not support `batch.size` or `linger.ms` which would otherwise be beneficial in some situations.
ducktape diff: https://github.com/confluentinc/ducktape/compare/v0.7.8...v0.7.9
- bcrypt (a dependency of ducktape) dropped Python2.7 support.
ducktape-0.7.9 now pins bcrypt to a Python2.7-supported version.
Author: Andrew Egelhofer <aegelhofer@confluent.io>
Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#9192 from andrewegel/trunk
Implements the part of KIP-612 that adds broker configurations for broker-wide and per-listener connection creation rate limits and enforces these limits.
Reviewers: David Jacot <djacot@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
Addition of configs for custom topic creation with KIP-158 created a regression when transformation configs are also included in the configuration of a source connector.
To experience the issue, just enabling topic creation at the worker is not sufficient. A user needs to supply a source connector configuration that contains both transformations and custom topic creation properties.
The issue is that the enrichment of configs in `SourceConnectorConfig` happens on top of an `AbstractConfig` rather than a `ConnectorConfig`. Inheriting from the latter allows enrichment to be composable for both topic creation and transformations.
Unit tests and integration tests are written to test these combinations.
Reviewers: Randall Hauch <rhauch@gmail.com>
The main goal is to remove usage of embedded broker (EmbeddedKafkaCluster) in AbstractJoinIntegrationTest and its subclasses.
This is because the tests under this class are no longer using the embedded broker, except for two.
testShouldAutoShutdownOnIncompleteMetadata is one of such tests.
Furthermore, this test does not actually perfom stream-table join; it is testing an edge case of joining with a non-existent topic, so it should be in a separate test.
Testing strategy: run existing unit and integration test
Reviewers: Boyang Chen <boyang@confluent.io>, Bill Bejeck <bbejeck@apache.org>
Refactor the RocksDB store and the metrics infrastructure in Streams
in preparation of the recordings of the RocksDB properties specified in KIP-607.
The refactoring includes:
* wrapper around BlockedBasedTableConfig to make the cache accessible to the
RocksDB metrics recorder
* RocksDB metrics recorder now takes also the DB instance and the cache in addition
to the statistics
* The value providers for the metrics are added to the RockDB metrics recorder also if
the recording level is INFO.
* The creation of the RocksDB metrics recording trigger is moved to StreamsMetricsImpl
Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <vvcephei@apache.org>
This patch fixes the generated serde logic for the 'records' type so that it uses the compact byte array representation consistently when flexible versions are enabled.
Reviewers: David Arthur <mumrah@gmail.com>
In order to do this, I also removed the optimization such that once enforced checkpoint is set to true, we always checkpoint unless the state stores are not initialized at all (i.e. the snapshot is null).
Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Add a separate error code as PRODUCER_FENCED to differentiate INVALID_PRODUCER_EPOCH. On broker side, replace INVALID_PRODUCER_EPOCH with PRODUCER_FENCED when the request version is the latest, while still returning INVALID_PRODUCER_EPOCH to older clients. On client side, simply handling INVALID_PRODUCER_EPOCH the same as PRODUCER_FENCED if from txn coordinator APIs.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
The message generator was missing conversion logic for tagged structures. This led to casting errors when either `fromStruct` or `toStruct` were invoked. This patch also adds missing null checks in the serialization of tagged byte arrays, which was found from improved test coverage.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
In Kafka Streams the source-of-truth of a state store is in its changelog, therefore when committing a state store we only need to make sure its changelog records are all flushed and committed, but we do not actually need to make sure that the materialized state have to be flushed and persisted since they can always be restored from changelog when necessary.
On the other hand, flushing a state store too frequently may have side effects, e.g. rocksDB flushing would gets the memtable into an L0 sstable, leaving many small L0 files to be compacted later, which introduces larger overhead.
Therefore this PR decouples flushing from committing, such that we do not always flush the state store upon committing, but only when sufficient data has been written since last time flushed. The checkpoint file would then also be overwritten only along with flushing the state store indicating its current known snapshot. This is okay since: a) if EOS is not enabled, then it is fine if the local persisted state is actually ahead of the checkpoint, b) if EOS is enabled, then we would never write a checkpoint file until close.
Here's a more detailed change list of this PR:
1. Do not always flush state stores when calling pre-commit; move stateMgr.flush into post-commit to couple together with checkpointing.
2. In post-commit, we checkpoint when: a) The state store's snapshot has progressed much further compared to the previous checkpoint, b) When the task is being closed, in which case we enforce checkpointing.
3. There are some tricky obstacles that I'd have to work around in a bit hacky way: for cache / suppression buffer, we still need to flush them in pre-commit to make sure all records sent via producers, while the underlying state store should not be flushed. I've decided to introduce a new API in CachingStateStore to be triggered in pre-commit.
I've also made some minor changes piggy-backed in this PR:
4. Do not delete checkpoint file upon loading it, and as a result simplify the checkpointNeeded logic, initializing the snapshotLastFlush to the loaded offsets.
5. In closing, also follow the commit -> suspend -> close ordering as in revocation / assignment.
6. If enforceCheckpoint == true during RUNNING, still calls maybeCheckpoint even with EOS since that is the case for suspending / closing.
Reviewers: John Roesler <john@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This patch removes the PartitionHeader grouping from the Fetch response. With old versions of the protocol, there was no cost for this grouping, but once we add flexible version support, then it adds an extra byte to the schema for tagged fields with little apparent benefit.
Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>
Update `CogroupedStreamAggregateBuilder` to have individual builders depending
on the windowed aggregation, or lack thereof. This replaced passing in all options
into the builder, with all but the current type of aggregation set to null and then
checking to see which value was not null.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
This patch ensures we use a force resolution strategy for the scala-library dependency.
I've tested this locally and saw a difference in the output.
With the change (using 2.4 and the jackson library 2.10.5):
```
./core/build/dependant-libs-2.12.10/scala-java8-compat_2.12-0.9.0.jar
./core/build/dependant-libs-2.12.10/scala-collection-compat_2.12-2.1.2.jar
./core/build/dependant-libs-2.12.10/scala-reflect-2.12.10.jar
./core/build/dependant-libs-2.12.10/scala-logging_2.12-3.9.2.jar
./core/build/dependant-libs-2.12.10/scala-library-2.12.10.jar
```
Without (using 2.4 and the jackson library 2.10.5):
```
find . -name 'scala*.jar'
./core/build/dependant-libs-2.12.10/scala-java8-compat_2.12-0.9.0.jar
./core/build/dependant-libs-2.12.10/scala-collection-compat_2.12-2.1.2.jar
./core/build/dependant-libs-2.12.10/scala-reflect-2.12.10.jar
./core/build/dependant-libs-2.12.10/scala-logging_2.12-3.9.2.jar
./core/build/dependant-libs-2.12.10/scala-library-2.12.12.jar
```
Reviewers: Ismael Juma <ismael@juma.me.uk>
Adds the new Processor and ProcessorContext interfaces
as proposed in KIP-478. To integrate in a staged fashion
with the code base, adapters are included to convert back
and forth between the new and old APIs.
ProcessorNode is converted to the new APIs.
Reviewers: Boyang Chen <boyang@confluent.io>
The patch https://github.com/apache/kafka/pull/8672 introduced a bug leading to crashing the replica fetcher threads. The issue is that https://github.com/apache/kafka/pull/8672 deletes the Partitions prior to stopping the replica fetchers. As the replica fetchers relies access the Partition in the ReplicaManager, they crash with a NotLeaderOrFollowerException that is not handled.
This PR reverts the code to the original ordering to avoid this issue.
The regression was caught and validated by our system test: `kafkatest.tests.core.reassign_partitions_test`.
Reviewers: Vikas Singh <vikas@confluent.io>, Jason Gustafson <jason@confluent.io>
This PR improves the logging for segment deletion to ensure that a reason is logged for segment deletions via all code paths. It also updates the logging so that we log a reason for an entire batch of deletions instead of logging one message per segment in cases when segment-level details are not significant.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Kowshik Prakasam <kprakasam@confluent.io>, Jason Gustafson <jason@confluent.io>
JIRA: https://issues.apache.org/jira/browse/KAFKA-10193
* add `preempt(): Unit` method for all `ControllerEvent` so that all events (and future events) must implement it
* for events that have callbacks, move the preemption from individual methods to `preempt()`
* add preemption for `ApiPartitionReassignment` and `ListPartitionReassignments`
* add integration tests:
1. test whether `preempt()` is called when controller shuts down
2. test whether the events with callbacks have the correct error response (`NOT_CONTROLLER`)
* explicit typing for `ControllerEvent` methods
Author: jeff kim <jeff.kim@confluent.io>
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>,Stanislav Kozlovski <stanislav@confluent.io>, David Arthur <mumrah@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#9050 from jeffkbkim/KAFKA-10193-controller-events-add-preemption
- part of KIP-572
- replace `retries` in InternalTopicManager with infinite retires plus a new timeout, based on consumer config MAX_POLL_INTERVAL_MS
Reviewers: David Jacot <djacot@confluent.io>, Boyang Chen <boyang@confluent.io>
- implements KIP-648
- Deprecated the existing getters and added new getters without `get` prefix to `KeyQueryMetadata`
Co-authored-by: johnthotekat <Iabon1989*>
Reviewers: Navinder Pal Singh Brar <navinder_brar@yahoo.com>, Matthias J. Sax <matthias@confluent.io>
Based on the discussion in #9072, I have put together an alternative way. This one does the following:
Instead of changing the implementation of the Rate to behave like a Token Bucket, it actually use two different metrics: the regular Rate and a new Token Bucket. The latter is used to enforce the quota.
The Token Bucket algorithm uses the rate of the quota as the refill rate for the credits and compute the burst based on the number of samples and their length (# samples * sample length * quota).
The Token Bucket algorithm used can go under zero in order to handle unlimited burst (e.g. create topic with a number of partitions higher than the burst). Throttling kicks in when the number of credits is under zero.
The throttle time is computed as credits under zero / refill rate (or quota).
Only the controller mutation uses it for now.
The remaining number of credits in the bucket is exposed with the tokens metrics per user/clientId.
Reviewers: Anna Povzner <anna@confluent.io>, Jun Rao <junrao@gmail.com>
- part of KIP-572
- removed the usage of `retries` in `GlobalStateManger`
- instead of retries the new `task.timeout.ms` config is used
Reviewers: John Roesler <john@confluent.io>, Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
- replace System.exit with Exit.exit in all relevant classes
- forbid use of System.exit in all relevant classes and add exceptions for others
Co-authored-by: John Roesler <vvcephei@apache.org>
Co-authored-by: Matthias J. Sax <matthias@confluent.io>
Reviewers: Lucas Bradstreet <lucas@confluent.io>, Ismael Juma <ismael@confluent.io>
Creating a topic may fail (due to timeout) in running system tests. However, `RoundTripWorker` does not ignore `TopicExistsException` which makes `round_trip_fault_test.py` be a flaky one.
More specifically, a network exception can cause the `CreateTopics` request to reach Kafka but Trogdor retry it
and hit a `TopicAlreadyExists` exception on the retry, failing the test.
Reviewers: Ismael Juma <ismael@juma.me.uk>