This PR fixes two issues introduced by #9114.
- When the metric was switched from Rate to TokenBucket in the ControllerMutationQuotaManager, the metrics were mixed up. That broke the quota update path.
- When a quota is updated, the ClientQuotaManager updates the MetricConfig of the KafkaMetric. That update was not reflected in the Sensor, so the Sensor kept using the MetricConfig it was created with.
Reviewers: Anna Povzner <anna@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
This change lays the groundwork for migrating other modules incrementally.
Main changes:
- Replace `junit` 4.13 with `junit-jupiter` and `junit-vintage` 5.7.0-RC1.
- All modules except for `tools` depend on `junit-vintage`.
- `tools` depends on `junit-jupiter`.
- Convert `tools` tests to JUnit 5.
- Update `PushHttpMetricsReporterTest` to use `mockito` instead of `powermock` and `easymock`
(powermock doesn't seem to work well with JUnit 5 and we don't need it since mockito can mock
static methods).
- Update `mockito` to 3.5.7.
- Update `TestUtils` to use JUnit 5 assertions since `tools` depends on it.
Unrelated clean-ups:
- Remove `unit` from package names in a few `core` tests.
- Replace `try/catch/fail` with `assertThrows` in a number of places.
- Tag `CoordinatorTest` as integration test.
- Remove unnecessary type parameters when invoking methods and constructors.
Tested with IntelliJ and gradle. Verified that the following commands work as expected:
* ./gradlew tools:unitTest
* ./gradlew tools:integrationTest
* ./gradlew tools:test
* ./gradlew core:unitTest
* ./gradlew core:integrationTest
* ./gradlew clients:test
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
1. Split the consumer coordinator's REBALANCING state into PREPARING_REBALANCE and COMPLETING_REBALANCE. The first begins when the join-group request is sent, and the second starts once the join-group response is received. During the first state we should still not send heartbeats, since they share the same socket as the join-group request and the group coordinator has disabled the timeout; however, once we transition to the second state we should start sending heartbeats in case the leader's assignment takes a long time (a sketch of the per-state heartbeat gating follows this description). This also fixes KAFKA-10122.
2. When computing coordinator#timeToNextPoll, do not factor in timeToNextHeartbeat if the state is UNJOINED or PREPARING_REBALANCE, since heartbeats are disabled in those states and hence their timer is not updated.
3. On the broker side, allow heartbeats received during PREPARING_REBALANCE and return the NONE error code instead of REBALANCE_IN_PROGRESS. However, on the client side we still need to ignore REBALANCE_IN_PROGRESS if the state is COMPLETING_REBALANCE, in case the client is talking to an older broker.
4. Piggy-back a log4j improvement on the broker-side coordinator to log the reason a rebalance is triggered, as I found it hard to determine during the investigation. This also subsumes the log4j improvements from #9038.
The tricky part of allowing heartbeats during COMPLETING_REBALANCE is twofold: 1) before the sync-group response is received, a heartbeat response may have reset the generation; and even after the sync-group response but before the callback is triggered, a heartbeat response can still reset the generation, so we need to handle both cases by checking the generation / state. 2) With the heartbeat thread enabled, the sync-group request may be sent by the heartbeat thread even if the caller thread has not called poll yet.
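A minimal sketch (not the actual AbstractCoordinator code; the state names follow the description above, everything else is illustrative) of when heartbeats are enabled per state:

```java
public class HeartbeatGatingSketch {
    enum MemberState { UNJOINED, PREPARING_REBALANCE, COMPLETING_REBALANCE, STABLE }

    // Heartbeats stay disabled until the join-group response has been received.
    static boolean heartbeatEnabled(MemberState state) {
        switch (state) {
            case COMPLETING_REBALANCE: // join-group response received, waiting on sync-group / assignment
            case STABLE:               // assignment received, normal operation
                return true;
            case UNJOINED:             // not part of a group yet
            case PREPARING_REBALANCE:  // join-group request in flight, coordinator has disabled the timeout
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        for (MemberState state : MemberState.values())
            System.out.println(state + " -> heartbeats " + (heartbeatEnabled(state) ? "enabled" : "disabled"));
    }
}
```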
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, John Roesler <john@confluent.io>
Fixes flakiness in `KafkaAdminClientTest` introduced by #8864. Addresses the following flaky tests:
- testAlterReplicaLogDirsPartialFailure
- testDescribeLogDirsPartialFailure
- testMetadataRetries
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
The schema specification allows a struct type name to differ from the field name. This works with the generated `Message` classes, but not with the generated JSON converter. The patch fixes the problem: the type name was being replaced with the field name when the struct was registered in the `StructRegistry`.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Add ImplicitLinkedHashCollection#moveToEnd.
Refactor ImplicitLinkedHashCollectionIterator to be a little more robust against
concurrent modifications to the map (which admittedly should not happen).
Reviewers: Jason Gustafson <jason@confluent.io>
Implement the KIP-554 API to create, describe, and alter SCRAM user configurations via the AdminClient. Add ducktape tests, and modify JUnit tests to test and use the new API where appropriate.
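A hedged usage sketch of the new AdminClient API, assuming the KIP-554 class and method names; the bootstrap address, user name, password, and iteration count are purely illustrative:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ScramCredentialInfo;
import org.apache.kafka.clients.admin.ScramMechanism;
import org.apache.kafka.clients.admin.UserScramCredentialUpsertion;

public class ScramAdminExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Create or update a SCRAM-SHA-256 credential for user "alice".
            ScramCredentialInfo info = new ScramCredentialInfo(ScramMechanism.SCRAM_SHA_256, 8192);
            admin.alterUserScramCredentials(
                Collections.singletonList(new UserScramCredentialUpsertion("alice", info, "alice-secret"))
            ).all().get();

            // Describe the credentials that now exist for "alice" (passwords are never returned).
            System.out.println(
                admin.describeUserScramCredentials(Collections.singletonList("alice")).all().get());
        }
    }
}
```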
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Rajini Sivaram <rajinisivaram@googlemail.com>
For the generated message code, put the JSON conversion functionality
in a separate JsonConverter class.
Make MessageDataGenerator simply another generator class, alongside the
new JsonConverterGenerator class. Move some of the utility functions
from MessageDataGenerator into FieldSpec and other places, so that they
can be used by other generator classes.
Use argparse4j to support a better command-line for the generator.
Reviewers: David Arthur <mumrah@gmail.com>
Adds avg, min, and max e2e latency metrics at the new TRACE level. Also adds the missing avg task-level metric at the INFO level.
I think where we left off with the KIP, the TRACE-level metrics were still defined to be "stateful-processor-level". I realized this doesn't really make sense and would be pretty much impossible to define given the DFS processing approach of Streams, and felt that store-level metrics made more sense to begin with. I haven't updated the KIP yet, so that I can get some initial feedback on this first.
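Not the actual Streams metrics wiring, but a small sketch using the common metrics API to show how avg/min/max of a recorded end-to-end latency can be exposed; the metric and group names here are illustrative, not the ones added by this PR:

```java
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Avg;
import org.apache.kafka.common.metrics.stats.Max;
import org.apache.kafka.common.metrics.stats.Min;

public class E2eLatencySketch {
    public static void main(String[] args) {
        Metrics metrics = new Metrics();
        Sensor sensor = metrics.sensor("record-e2e-latency");
        sensor.add(metrics.metricName("record-e2e-latency-avg", "example-metrics"), new Avg());
        sensor.add(metrics.metricName("record-e2e-latency-min", "example-metrics"), new Min());
        sensor.add(metrics.metricName("record-e2e-latency-max", "example-metrics"), new Max());

        // e2e latency = time the record leaves the processor/store minus the record's timestamp.
        long recordTimestamp = System.currentTimeMillis() - 25;
        long now = System.currentTimeMillis();
        sensor.record(now - recordTimestamp);

        metrics.metrics().forEach((name, metric) ->
            System.out.println(name.name() + " = " + metric.metricValue()));
    }
}
```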
Reviewers: Bruno Cadonna <bruno@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Only check if positions need validation if there is new metadata.
Also fix some inefficient java.util.stream code in the hot path of SubscriptionState.
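Not the actual SubscriptionState code, but a hypothetical before/after illustrating the kind of hot-path change: a short-circuiting loop avoids allocating the stream pipeline and lambda on every poll.

```java
import java.util.List;

public class HotPathSketch {
    static class PartitionState {
        boolean awaitingValidation;
        boolean awaitingValidation() { return awaitingValidation; }
    }

    // Before: allocates a stream pipeline on every call.
    static boolean anyAwaitingValidationStream(List<PartitionState> states) {
        return states.stream().anyMatch(PartitionState::awaitingValidation);
    }

    // After: plain loop, no per-call allocation.
    static boolean anyAwaitingValidationLoop(List<PartitionState> states) {
        for (PartitionState state : states) {
            if (state.awaitingValidation())
                return true;
        }
        return false;
    }
}
```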
This patch fixes the generated serde logic for the 'records' type so that it uses the compact byte array representation consistently when flexible versions are enabled.
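A minimal sketch of the compact representation assumed here: the length is encoded as an unsigned varint of size + 1, with 0 denoting null, rather than a fixed four-byte length. `writeCompactRecords` is a hypothetical helper, not the generated code itself:

```java
import java.nio.ByteBuffer;
import org.apache.kafka.common.utils.ByteUtils;

public class CompactRecordsSketch {
    static void writeCompactRecords(ByteBuffer out, ByteBuffer records) {
        if (records == null) {
            ByteUtils.writeUnsignedVarint(0, out);                       // 0 means null
        } else {
            ByteUtils.writeUnsignedVarint(records.remaining() + 1, out); // compact length = size + 1
            out.put(records.duplicate());
        }
    }
}
```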
Reviewers: David Arthur <mumrah@gmail.com>
Add a separate error code, PRODUCER_FENCED, to differentiate it from INVALID_PRODUCER_EPOCH. On the broker side, replace INVALID_PRODUCER_EPOCH with PRODUCER_FENCED when the request version is the latest, while still returning INVALID_PRODUCER_EPOCH to older clients. On the client side, simply handle INVALID_PRODUCER_EPOCH the same as PRODUCER_FENCED when it comes from transaction coordinator APIs.
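A hedged sketch of the version-gated mapping described above; the helper name and the version parameter are hypothetical, only the two error codes come from this change:

```java
import org.apache.kafka.common.protocol.Errors;

public class ProducerFencedMappingSketch {
    // Newer request versions receive the dedicated PRODUCER_FENCED error; older clients
    // keep receiving INVALID_PRODUCER_EPOCH for compatibility.
    static Errors maybeDowngrade(Errors error, short requestVersion, short firstVersionWithProducerFenced) {
        if (error == Errors.PRODUCER_FENCED && requestVersion < firstVersionWithProducerFenced)
            return Errors.INVALID_PRODUCER_EPOCH;
        return error;
    }
}
```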
Reviewers: Guozhang Wang <wangguoz@gmail.com>
The message generator was missing conversion logic for tagged structures. This led to casting errors when either `fromStruct` or `toStruct` was invoked. This patch also adds missing null checks in the serialization of tagged byte arrays, which was found through improved test coverage.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This patch removes the PartitionHeader grouping from the Fetch response. With old versions of the protocol, there was no cost to this grouping, but once we add flexible version support, it adds an extra byte to the schema for tagged fields, with little apparent benefit.
Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>
Based on the discussion in #9072, I have put together an alternative approach. It does the following:
Instead of changing the implementation of the Rate to behave like a Token Bucket, it actually uses two different metrics: the regular Rate and a new Token Bucket. The latter is used to enforce the quota.
The Token Bucket algorithm uses the rate of the quota as the refill rate for the credits and computes the burst based on the number of samples and their length (# samples * sample length * quota).
The Token Bucket can go below zero in order to handle an unlimited burst (e.g. creating a topic with a number of partitions higher than the burst). Throttling kicks in when the number of credits goes below zero.
The throttle time is computed as the credits under zero divided by the refill rate (i.e. the quota).
Only the controller mutation quota uses it for now.
The remaining number of credits in the bucket is exposed via the tokens metric per user/clientId.
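A minimal, self-contained sketch of the behaviour described above (not the actual TokenBucket metric implementation): credits refill at the quota rate, the burst is # samples * sample length * quota, the balance may go below zero, and the throttle time is the deficit divided by the quota.

```java
public class TokenBucketSketch {
    private final double quota;   // refill rate, in credits per second
    private final double burst;   // maximum credits = samples * sampleLengthSeconds * quota
    private double credits;
    private long lastUpdateMs;

    public TokenBucketSketch(double quota, int samples, int sampleLengthSeconds, long nowMs) {
        this.quota = quota;
        this.burst = samples * sampleLengthSeconds * quota;
        this.credits = burst;
        this.lastUpdateMs = nowMs;
    }

    /** Deduct the cost of an operation; the balance may go below zero for an oversized burst. */
    public void record(double cost, long nowMs) {
        refill(nowMs);
        credits -= cost;
    }

    /** Throttle time in ms: credits under zero divided by the refill rate (the quota). */
    public long throttleTimeMs(long nowMs) {
        refill(nowMs);
        return credits >= 0 ? 0 : (long) (-credits / quota * 1000);
    }

    private void refill(long nowMs) {
        credits = Math.min(burst, credits + quota * (nowMs - lastUpdateMs) / 1000.0);
        lastUpdateMs = nowMs;
    }
}
```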
Reviewers: Anna Povzner <anna@confluent.io>, Jun Rao <junrao@gmail.com>
Enhance the understandability of the constrainedAssign and generalAssign methods by adding more detailed meta comments.
Co-authored-by: A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Refactored FetchRequest and FetchResponse to use the generated message classes for serialization and deserialization. This allows us to bypass unnecessary Struct conversion in a few places. A new "records" type was added to the message protocol which uses BaseRecords as the field type. When sending, we can set a FileRecords instance on the message, and when receiving the message class will use MemoryRecords.
Also included a few JMH benchmarks which indicate a small performance improvement for requests with high partition counts or small record sizes.
Reviewers: Jason Gustafson <jason@confluent.io>, Boyang Chen <boyang@confluent.io>, David Jacot <djacot@confluent.io>, Lucas Bradstreet <lucas@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Colin P. McCabe <cmccabe@apache.org>
Reviewers: Mickael Maison <mickael.maison@gmail.com>, David Jacot <djacot@confluent.io>, Lee Dongjin <dongjin@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
Modified KafkaProducer.sendOffsetsToTransaction() to respect max.block.ms, and added timeout tests for the blocking methods.
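A usage sketch showing the call that now honours max.block.ms alongside the other blocking transactional methods; the broker address, topics, group id, offset, and timeout value are illustrative:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerGroupMetadata;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringSerializer;

public class SendOffsetsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "example-txn-id");
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "5000"); // now also bounds sendOffsetsToTransaction()
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("output-topic", "key", "value"));
            producer.sendOffsetsToTransaction(
                Collections.singletonMap(new TopicPartition("input-topic", 0), new OffsetAndMetadata(42L)),
                new ConsumerGroupMetadata("example-group"));
            producer.commitTransaction();
        }
    }
}
```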
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Xi Hu <huxi_2b@hotmail.com>
Add a broker-to-controller channel manager for use cases such as redirection and AlterIsr.
Reviewers: David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
Co-authored-by: Viktor Somogyi <viktorsomogyi@gmail.com>
Co-authored-by: Boyang Chen <boyang@confluent.io>
While debugging a rebalance scenario I found that, inside rejoinNeededOrPending, rebalances triggered by metadata or subscription changes are not logged, which makes it tricky to find out why a rebalance was triggered. I'm adding two INFO log4j entries to fill the gap.
Other requestRejoin() calls are already covered.
Reviewers: Boyang Chen <boyang@confluent.io>
This PR implements the broker-side changes of KIP-599, except for the changes to the Rate implementation, which will be addressed separately. The PR changes/introduces the following:
- It introduces the protocol changes.
- It introduces a new quota manager ControllerMutationQuotaManager which is another specialization of the ClientQuotaManager.
- It enforces the quota in the KafkaApis and in the AdminManager. This part handles new and old clients as described in the KIP.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
- part of KIP-572
- deprecates producer config `retries` (still in use)
- deprecates admin config `retries` (still in use)
- deprecates Kafka Streams config `retries` (will be ignored)
- adds new Kafka Streams config `task.timeout.ms` (follow up PRs will leverage this new config)
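A hedged config sketch of the new option: the `task.timeout.ms` name comes from this PR, while the value and the remaining properties are illustrative.

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class TaskTimeoutConfigExample {
    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "example-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // KIP-572: bound how long a task may keep retrying on retriable errors,
        // replacing the now-deprecated `retries` config for Kafka Streams.
        props.put("task.timeout.ms", "300000");
        return props;
    }
}
```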
Reviewers: John Roesler <john@confluent.io>, Jason Gustafson <jason@confluent.io>, Randall Hauch <randall@confluent.io>
- After #8312, older brokers return empty configs when queried with the latest `adminClient.describeConfigs`. Old brokers receive empty configNames in the `AdminManager.describeConfigs()` method, but they do not handle empty configKeys and therefore filter out all the configs (the affected client call is sketched below).
- Update ClientCompatibilityTest to verify describe configs
- Add a test case for describing configs with empty configuration keys
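The affected client call, sketched with illustrative names (standard AdminClient API; the topic name and broker address are hypothetical):

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class DescribeConfigsExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // No config names are specified, i.e. "give me all configs of this resource";
            // older brokers mis-handled the resulting empty configKeys and returned nothing.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "example-topic");
            Map<ConfigResource, Config> configs = admin.describeConfigs(Collections.singleton(topic)).all().get();
            configs.get(topic).entries().forEach(e -> System.out.println(e.name() + " = " + e.value()));
        }
    }
}
```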
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #9046 from omkreddy/KAFKA-9432
Brokers currently return NOT_LEADER_FOR_PARTITION to producers and REPLICA_NOT_AVAILABLE to consumers if a replica is not available on the broker during reassignments. Non-Java clients treat REPLICA_NOT_AVAILABLE as a non-retriable exception, while Java consumers handle this error by explicitly matching the error code even though it is not an InvalidMetadataException. This PR renames NOT_LEADER_FOR_PARTITION to NOT_LEADER_OR_FOLLOWER and uses the same error for producers and consumers. This is compatible with both Java and non-Java clients, since all clients handle this error code (6) as a retriable exception. The PR also makes ReplicaNotAvailableException a subclass of InvalidMetadataException.
- ALTER_REPLICA_LOG_DIRS continues to return REPLICA_NOT_AVAILABLE. Retained this for compatibility since this request never returned NOT_LEADER_FOR_PARTITION earlier.
- MetadataRequest version 0 also returns REPLICA_NOT_AVAILABLE as the topic-level error code for compatibility. Newer versions filter these out and return Errors.NONE, so this was not changed.
- Partition responses in MetadataRequest return REPLICA_NOT_AVAILABLE to indicate that one of the replicas is not available. Did not change this since NOT_LEADER_FOR_PARTITION is not suitable in this case.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Bob Barrett <bob.barrett@confluent.io>
The intention of using poll(0) is to not block on rebalance but still return some data; however, `updateAssignmentMetadataIfNeeded` contains three different pieces of logic: 1) discover the coordinator if necessary, 2) join the group if necessary, 3) refresh metadata and fetch positions if necessary. We only want 2) to be non-blocking, not the others, since e.g. when the coordinator is down, the heartbeat would expire and cause the consumer to fetch with timeout 0 as well, causing unnecessarily high CPU usage.
Since splitting this function is a rather big change to make as a last-minute blocker fix for 2.6, I made a smaller change: updateAssignmentMetadataIfNeeded takes an optional boolean flag indicating whether 2) above should wait until it either expires or completes; otherwise we do not wait on the join-group future and just poll with a zero timer.
Reviewers: Jason Gustafson <jason@confluent.io>
This PR fixes a bug introduced in #8683.
While processing connection set-up timeouts, we iterate through the connecting nodes to process timeouts, and we disconnect within the loop, removing the entry from the very set that the loop is iterating over. That raises a ConcurrentModificationException. The current unit test did not catch this because it was using only one node.
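A minimal illustration of the pattern, with hypothetical node ids and a hypothetical timeout check (not the actual client networking code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class ConnectingNodesSketch {
    public static void main(String[] args) {
        Set<String> connectingNodes = new HashSet<>(Arrays.asList("0", "1", "2"));

        // Buggy pattern: removing from the set inside a for-each loop over the same set
        // throws ConcurrentModificationException once more than one node is involved.
        // for (String nodeId : connectingNodes)
        //     if (timedOut(nodeId))
        //         connectingNodes.remove(nodeId);

        // Safe pattern: remove through the iterator.
        Iterator<String> iter = connectingNodes.iterator();
        while (iter.hasNext()) {
            if (timedOut(iter.next()))
                iter.remove();
        }
        System.out.println(connectingNodes);
    }

    // Hypothetical stand-in for the connection set-up timeout check.
    static boolean timedOut(String nodeId) {
        return "1".equals(nodeId);
    }
}
```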
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
The documentation for max.block.ms said it affected only send()
and partitionsFor(), but it actually also affects initTransactions(),
abortTransaction() and commitTransaction(). So rework the
documentation to cover these methods too.
Reviewers: Boyang Chen <boyang@confluent.io>
This PR includes 3 MessageFormatters for MirrorMaker2 internal topics:
- HeartbeatFormatter
- CheckpointFormatter
- OffsetSyncFormatter
This also introduces a new public interface org.apache.kafka.common.MessageFormatter that users can implement to build custom formatters.
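A sketch of a user-supplied formatter against the new interface; the class name and output format are illustrative, and the `writeTo(ConsumerRecord<byte[], byte[]>, PrintStream)` signature is assumed from the new interface:

```java
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.MessageFormatter;

public class KeyValueFormatter implements MessageFormatter {
    @Override
    public void writeTo(ConsumerRecord<byte[], byte[]> record, PrintStream output) {
        String key = record.key() == null ? "null" : new String(record.key(), StandardCharsets.UTF_8);
        String value = record.value() == null ? "null" : new String(record.value(), StandardCharsets.UTF_8);
        output.println(key + "\t" + value);
    }
}
```

Such a class could then be passed to the console consumer via its `--formatter` option, e.g. `kafka-console-consumer.sh --formatter com.example.KeyValueFormatter ...`.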
Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>, Ryanne Dolan <ryannedolan@gmail.com>, David Jacot <djacot@confluent.io>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>