Sender/RecordAccumulator never resets the next batch expiry time. Its always computed as the min of the current value and the expiry time for all batches being processed. This means that its always set to the expiry time of the first batch, and once that time has passed Sender starts spinning on epoll with a timeout of 0, which consumes a lot of CPU. This patch updates Sender to reset the next batch expiry time on each poll loop so that a new value reflecting the expiry time for the current set of batches is computed.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
We should check TxnOffsetCommit responses for the COORDINATOR_LOADING_IN_PROGRESS error code and retry if we see it. Additionally, if we encounter an abortable error, we need to ensure that pending transaction offset commits are cleared.
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Ensure that sends are completed before waiting for channel to be closed based on idle expiry, since channel will not be expired if added to ready keys in the next poll as a result of pending sends.
Reviewers: Jun Rao <junrao@gmail.com>
Use a volatile field to track the size of the set of assigned partitions to avoid the concurrent access to the underlying linked hash map.
Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
We currently do a lot of bookkeeping for timeouts which is both error-prone and distracting. This patch adds a new `Timer` class to simplify this logic and control unnecessary calls to system time. In particular, this helps with nested timeout operations. The consumer has been updated to use the new class.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Currently Fetcher.getTopicMetadata() will not include offline partitions. Thus
KafkaConsumer.partitionsFor(topic) will not return all partitions of a topic if
there if any partition of the topic is offline. This causes problem if user
tries to query the total number of partitions of the given topic.
Author: radai-rosenblatt <radai.rosenblatt@gmail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#4679 from radai-rosenblatt/partition_shenanigans
Fixed incorrect use of default timeout instead of the argument explicitly passed to `newClientRequest`.
Reviewers: Ron Dagostino <rndgstn@gmail.com>, Ismael Juma <ismael@juma.me.uk>
This patch forces metadata update for consumers with pattern subscription at the beginning of rebalance (retry.backoff.ms is respected). This is to prevent such consumers from detecting subscription changes (e.g., new topic creation) independently and triggering multiple unnecessary rebalances. KAFKA-7126 contains detailed scenarios and rationale.
Author: Jon Lee <jonlee@linkedin.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ted Yu <yuzhihong@gmail.com>, Dong Lin <lindong28@gmail.com>
Closes#5408 from jonlee2/KAFKA-7126
An untimely wakeup can cause ConsumerCoordinator.onJoinComplete to throw a WakeupException before completion. On the next poll(), it will be retried, but this leads to an underflow error because the buffer containing the assignment data will already have been advanced. The solution is to duplicate the buffer passed to onJoinComplete.
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
After successful completion of KafkaProducer#close, it is possible that an application calls KafkaProducer#send. If the send is invoked for a topic for which we do not have any metadata, the producer will block until `max.block.ms` elapses - we do not expect to receive any metadata update in this case because Sender (and NetworkClient) has already exited. It is only when RecordAccumulator#append is invoked that we notice that the producer has already been closed and throw an exception. If `max.block.ms` is set to Long.MaxValue (or a sufficiently high value in general), the producer could block awaiting metadata indefinitely.
This patch makes sure `Metadata#awaitUpdate` periodically checks if the network client has been closed, and if so bails out as soon as possible.
The SASL/OAUTHBEARER client response as currently implemented in OAuthBearerSaslClient sends the valid gs2-header "n,," but then sends the "auth" key and value immediately after it.
This does not conform to the specification because there is no %x01 after the gs2-header, no %x01 after the auth value, and no terminating %x01. Fixed this and the parsing of the client response in
OAuthBearerSaslServer, which currently allows the malformed text. Also updated to accept and ignore unknown properties as required by the spec.
Reviewers: Stanislav Kozlovski <familyguyuser192@windowslive.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
SSL `close_notify` from broker connection close was processed as a handshake failure in clients while unwrapping the message if a handshake is in progress. Updated to handle this as a retriable IOException rather than a non-retriable SslAuthenticationException to avoid authentication exceptions in clients during rolling restart of brokers.
Reviewers: Ismael Juma <ismael@juma.me.uk>
…egal char and generates InvalidTopicException
If config parameter max.block.ms config parameter is set to a non-zero value,
KafkaProducer.send() blocks for the max.block.ms time if topic name has illegal
char or is invalid.
Wrote a unit test that verifies the appropriate exception is returned when
performing a get on the returned future by KafkaProducer.send().
Author: Ahmed Al Mehdi <aalmehdi@aalmehdi-ld1.linkedin.biz>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Joel Koshy <jjkoshy@gmail.com>, Manikumar Reddy O <manikumar.reddy@gmail.com>
Closes#5247 from ahmedha/KAFKA-5098
Add some additional size validation to prevent overflows when using `FileRecords`.
Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <ismael@juma.me.uk>
We need to use the same lock for metric update and read to avoid NPE and concurrent modification exceptions. Sensor add/remove/update are synchronized on Sensor since they access lists and maps that are not thread-safe. Reporters are notified of metrics add/remove while holding (Sensor, Metrics) locks and reporters may synchronize on the reporter lock. Metric read may be invoked by metrics reporters while holding a reporter lock. So read/update cannot be synchronized using Sensor since that could lead to deadlock. This PR introduces a new lock in Sensor for update/read.
Locking order:
- Sensor#add: Sensor -> Metrics -> MetricsReporter
- Metrics#removeSensor: Sensor -> Metrics -> MetricsReporter
- KafkaMetric#metricValue: MetricsReporter -> Sensor#metricLock
- Sensor#record: Sensor -> Sensor#metricLock
Reviewers: Jun Rao <junrao@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
KAFKA-6986:Export Admin Client metrics through Stream Threads
We already exported producer and consumer metrics through KafkaStreams class:
#4998
It makes sense to also export the Admin client metrics.
I didn't add a separate unittest case for this. Let me know if it's needed.
This is my first contribution, feel free to point out any mistakes that I did.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
We need to ensure that the last poll time is always updated when the user call poll(Duration). This patch fixes a bug in the new KIP-266 timeout behavior which would cause this to be skipped if the coordinator could not be found while the consumer was in an active group.
Note that I've also fixed some type inconsistencies for various timeouts.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Previously, the connection-creation metric only accounted for opened connections from the broker. This change extends it to account for received connections.
They rely on finalizers (before Java 11), which create
unnecessary GC load. The alternatives are as easy to
use and don't have this issue.
Also use FileChannel directly instead of retrieving
it from RandomAccessFile whenever possible
since the indirection is unnecessary.
Finally, add a few try/finally blocks.
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Rajini Sivaram <rajinisivaram@googlemail.com>
Leftover threads doing network I/O can interfere with subsequent tests. Add missing shutdown in tests and include admin client in the check for leftover threads.
Reviewers: Anna Povzner <anna@confluent.io>, Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Manikumar Reddy O <manikumar.reddy@gmail.com>
This patch contains the improved offset expiration semantics proposed in KIP-211. Committed offsets will not be expired as long as a group is active. Once all members have left the group, then offsets will be expired after the timeout configured by `offsets.retention.minutes`. Note that the optimization for early expiration of unsubscribed topics will be implemented in a separate patch.
This patch fixes the following issues in the log splitting logic added to address KAFKA-6264:
1. We were not handling the case when all messages in the segment overflowed the index. In this case, there is only one resulting segment following the split.
2. There was an off-by-one error in the recovery logic when completing a swap operation which caused an unintended segment deletion.
Additionally, this patch factors out of `splitOverflowedSegment` a method to write to a segment using from with an instance of `FileRecords`. This allows for future reuse and isolated testing.
Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Avoid unnecessary processing of SSL channels when there are some bytes buffered, but not enough to make progress.
Reviewers: Radai Rosenblatt <radai.rosenblatt@gmail.com>, Jun Rao <junrao@gmail.com>
We might decide to drop certain message batches during down-conversion because older clients might not be able to interpret them. One such example is control batches which are typically removed by the broker if down-conversion to V0 or V1 is required. This patch makes sure the chunked down-conversion implementation is able to handle such cases.
The initial PR for KIP-290 #5117 added a new `ResourceNameType`, which was initially a field on `Resource` and `ResourceFilter`. However, follow on PRs have now moved the name type fields to new `ResourcePattern` and `ResourcePatternFilter` classes. This means the old name is no longer valid and may be confusing. The PR looks to rename the class to a more intuitive `resource.PatternType`.
@cmccabe also requested that the current `ANY` value for this class be renamed to avoid confusion. `PatternType.ANY` currently causes `ResourcePatternFilter` to bring back all ACLs that would affect the supplied resource, i.e. it brings back literal, wildcard ACLs, and also does pattern matching to work out which prefix acls would affect the resource. This is very different from the behaviour of `ResourceType.ANY`, which just means the filter ignores the type of resources.
`ANY` is to be renamed to `MATCH` to disambiguate it from other `ANY` filter types. A new `ANY` will be added that works in the same way as others, i.e. it will cause the filter to ignore the pattern type, (but won't do any pattern matching).
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Jun Rao <junrao@gmail.com>
This patch changes the default `request.timeout.ms` of the consumer to 30 seconds. Additionally, it adds logic to `NetworkClient` and related to components to support timeouts at the request level. We use this to handle the special case of the JoinGroup request, which may block for as long as the value configured by `max.poll.interval.ms`.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <guozhang@confluent.io>
Adds a configuration that specifies the default timeout for KafkaConsumer APIs that could block. This was introduced in KIP-266.
Reviewers: Satish Duggana <satish.duggana@gmail.com>, Jason Gustafson <jason@confluent.io>
This moves FileConfigProvider to the org.apache.common.config.provider package to more easily isolate provider implementations going forward.
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
It takes O(n^2) time to instantiate a mbean with n attributes which can be very slow if the number of attributes of this mbean is large. This PR removes metrics whose number of attributes can grow with the number of partitions in the cluster to fix the performance issue. These metrics have already been marked for removal in 2.0 by KIP-225.
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#5172 from lindong28/remove-deprecated-metrics
This is a follow-on change requested as part of the initial PR for KIP-290 #5117. @cmccabe requested that the `resource.Resource` class be factored out in favour of `ConfigResource` to avoid confusion between all the `Resource` implementations.
Colin Patrick McCabe <colin@cmccabe.xyz>, Jun Rao <junrao@gmail.com>
This patch adds logic to detect and fix segments which have overflowed offsets as a result of bugs in older versions of Kafka.
Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>
remove duplicate Scala ResourceNameType in preference to in preference to Java ResourceNameType.
This is follow on work for KIP-290 and PR #5117, which saw the Scala ResourceNameType class introduced.
I've added tests to ensure AclBindings can't be created with ResourceNameType.ANY or UNKNOWN.
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Jun Rao <junrao@gmail.com>
The initial PR for KIP-290 #5117 added a `ResourceNameType` field to the Java and Scala `Resource` classes to introduce the concept of Prefixed ACLS. This does not make a lot of sense as these classes are meant to represent cluster resources, which would not have a concept of 'name type'. This work has not been released yet, so we have time to change it.
This PR looks to refactor the code to remove the name type field from the Java `Resource` class. (The Scala one will age out once KIP-290 is done, and removing it would involve changes to the `Authorizer` interface, so this class was not touched).
This is achieved by replacing the use of `Resource` with `ResourcePattern` and `ResourceFilter` with `ResourceFilterPattern`. A `ResourcePattern` is a combination of resource type, name and name type, where each field needs to be defined. A `ResourcePatternFilter` is used to select patterns during describe and delete operations.
The adminClient uses `AclBinding` and `AclBindingFilter`. These types have been switched over to use the new pattern types.
The AclCommands class, used by Kafka-acls.sh, has been converted to use the new pattern types.
The result is that the original `Resource` and `ResourceFilter` classes are not really used anywhere, except deprecated methods. However, the `Resource` class will be used if/when KIP-50 is done.
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Jun Rao <junrao@gmail.com>
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Jun Rao <junrao@gmail.com>
Co-authored-by: Piyush Vijay <pvijay@apple.com>
Co-authored-by: Andy Coates <big-andy-coates@users.noreply.github.com>
PrincipalBuilder implementations can now take the listener into account
when creating the Principal. This is especially interesting in deployments
where inter-broker traffic is on a different listener than client traffic or
when the same protocol is used by multiple listeners.
The change in itself is mostly "plumbing" as the listener name needs to be
passed from ChannelBuilders all the way down to all classes implementing
AuthenticationContext.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>
Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
We added logic to reassign nodes in callToSend after a connection failure, but we do not handle the case when there is no node currently available to reassign the request to. This can happen when using MetadataUpdateNodeIdProvider if all of the known nodes are blacked out awaiting the retry backoff. To fix this, we need to ensure that the call is added to pendingCalls if a new node cannot be found.