After a recent leader election, the leaders high-water mark might lag behind the offset at the beginning of the new epoch (as well as the previous leader's HW). This can lead to offsets going backwards from a client perspective, which is confusing and leads to strange behavior in some clients.
This change causes Partition#fetchOffsetForTimestamp to throw an exception to indicate the offsets are not yet available from the leader. For new clients, a new OFFSET_NOT_AVAILABLE error is added. For existing clients, a LEADER_NOT_AVAILABLE is thrown.
This is an implementation of [KIP-207](https://cwiki.apache.org/confluence/display/KAFKA/KIP-207%3A+Offsets+returned+by+ListOffsetsResponse+should+be+monotonically+increasing+even+during+a+partition+leader+change).
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Dhruvil Shah <dhruvil@confluent.io>, Jason Gustafson <jason@confluent.io>
Ensure that channel and selection keys are removed from `Selector` collections before propagating connect exceptions. They are currently cleared on the next `poll()`, but we can't ensure that callers (NetworkClient for example) wont try to connect again before the next `poll` and hence we should clear the collections before re-throwing exceptions from `connect()`.
Reviewers: Jason Gustafson <jason@confluent.io>
The example in the producer's javadoc contained an inconsistent value for `delivery.timeout.ms`. This patch removes the inconsistent config and several unnecessary overrides in order to simplify the example.
Reviewers: huxi <huxi_2b@hotmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Older versions of the Produce API should return an error if zstd is used. This validation existed, but it was done during request parsing, which means that instead of returning an error code, the broker disconnected. This patch fixes the issue by moving the validation outside of the parsing logic. It also fixes several other record validations which had the same problem.
Reviewers: Jason Gustafson <jason@confluent.io>
This attempts to address the flaky test `SaslAuthenticatorTest.testCannotReauthenticateWithDifferentPrincipal()`
I was not able to reproduce locally even after 150 test runs in a loop, but given the error message:
```
org.junit.ComparisonFailure: expected:
<[6QBJiMZ6o5AqbNAjDTDjWtQSa4alfuUWsYKIy2tt7dz5heDaWZlz21yr8Gl4uEJkQABQXeEL0UebdpufDb5k8SvReSK6wYwQ9huP-9]> but was:<[????����????OAUTHBEARER]>
```
`????����????` seems to mean invalid UTF-8.
We now specify the charset when writing out and reading in bytes.
Reviewers: Ismael Juma <ismael@juma.me.uk>
A heap dump provided by Patrik Kleindl in https://issues.apache.org/jira/browse/KAFKA-7660 identifies the childrenSensors map in Metrics as keeping references to sensors alive after they have been removed.
This PR fixes it and adds a test to be sure.
Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
While metrics like Min, Avg and Max make sense to respective use Double.MAX_VALUE, 0.0 and Double.MIN_VALUE as default values to ease computation logic, exposing those values makes reading them a bit misleading. For instance, how would you differentiate whether your -avg metric has a value of 0 because it was given samples of 0 or no samples were fed to it?
It makes sense to standardize on the output of these metrics with something that clearly denotes that no values have been recorded.
Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
…eturned by poll() if there are any records to return
The MockConsumer behaves unlike the real consumer in that it can return a non-empty ConsumerRecords from poll, that also has a count of 0. This change makes the MockConsumer only add partitions to the ConsumerRecords if there are records to return for those partitions.
A unit test in MockConsumerTest demonstrates the issue.
Author: Stig Rohde Døssing <stigdoessing@gmail.com>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#5901 from srdo/KAFKA-7616
Improve the default group id behavior by:
* changing the default consumer group to null, where no offset commit or fetch, or group management operations are allowed
* deprecating the use of empty (`""`) consumer group on the client
Reviewers: Jason Gustafson <jason@confluent.io>
ReplicaFetcherThread.shutdown attempts to close the fetcher's Selector while the thread is running. This in unsafe and can result in `Selector.close()` failing with an exception. The exception is caught and logged at debug level, but this can lead to socket leak if the shutdown is due to dynamic config update rather than broker shutdown. This PR changes the shutdown logic to close Selector after the replica fetcher thread is shutdown, with a wakeup() and flag to terminate blocking sends first.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
- Use Xlint:all with 3 exclusions (filed KAFKA-7613 to remove the exclusions)
- Use the same javac options when compiling tests (seems accidental that
we didn't do this before)
- Replaced several deprecated method calls with non-deprecated ones:
- `KafkaConsumer.poll(long)` and `KafkaConsumer.close(long)`
- `Class.newInstance` and `new Integer/Long` (deprecated since Java 9)
- `scala.Console` (deprecated in Scala 2.11)
- `PartitionData` taking a timestamp (one of them seemingly a bug)
- `JsonMappingException` single parameter constructor
- Fix unnecessary usage of raw types in several places.
- Add @SuppressWarnings for deprecations, unchecked and switch fallthrough in
several places.
- Scala clean-ups (var -> val, ETA expansion warnings, avoid reflective calls)
- Use lambdas to simplify code in a few places
- Add @SafeVarargs, fix varargs usage and remove unnecessary `Utils.mkList` method
Reviewers: Matthias J. Sax <mjsax@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>, Randall Hauch <rhauch@gmail.com>, Bill Bejeck <bill@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
The problem is the concurrent metadata updates in the foreground and in the heartbeat thread. Changed the code to use ConsumerNetworkClient.poll, which enforces mutual exclusion when accessing the underlying client.
As part of KIP-320, the ListOffsets API should return the leader epoch of any fetched offset. We either get this epoch from the log itself for a timestamp query or from the epoch cache if we are searching the earliest or latest offset in the log. When handling queries for the latest offset, we have elected to choose the current leader epoch, which is consistent with other handling (e.g. OffsetsForTimes).
Reviewers: Jun Rao <junrao@gmail.com>
This patch makes two improvements to internal metadata handling logic and testing:
1. It reduce dependence on the public object `Cluster` for internal metadata propagation since it is not easy to evolve. As an example, we need to propagate leader epochs from the metadata response to `Metadata`, but it is not straightforward to do this without exposing it in `PartitionInfo` since that is what `Cluster` uses internally. By doing this change, we are able to remove some redundant `Cluster` building logic.
2. We want to make the metadata handling in `MockClient` simpler and more consistent. Currently we have mix of metadata update mechanisms which are internally inconsistent with each other and do not match the implementation in `NetworkClient`.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
KIP-368 implementation to enable periodic re-authentication of SASL clients. Also adds a broker configuration option to terminate client connections that do not re-authenticate within the configured interval.
Just a doc change
Author: John Eismeier <john.eismeier@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#4573 from jeis2497052/trunk
Decrease the lower bound for expected available memory, as thread
scheduling entails that a variable amount of deallocation happens by
the point of assertion.
Also make minor clarifications to test logic and comments.
The passing rate improved from 98% to 100% locally after these
changes (100+ runs).
Reviewers: Ismael Juma <ismael@juma.me.uk>
### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
Reduce tick interval of the mock timer and avoid large timer increments to avoid hitting idle expiry on the client-side before delayed close is processed by the server. Also reduce poll interval in the server to make the test complete faster (since delayed close is only processed when poll returns).
Reviewers: Ismael Juma <ismael@juma.me.uk>
Add CreatePartitionsRequest.PartitionDetails similar to CreateTopicsRequest.TopicDetails to avoid references from `common.requests` package to `clients`.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Update `ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup` test to work in environments where ipv6 is not enabled and `InetAddress.getAllByName` doesn't return ipv6 addresses.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Adds `client.dns.lookup=resolve_canonical_bootstrap_servers_only` option to perform full dns resolution of bootstrap addresses
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Sriharsha Chintalapani <sriharsha@apache.org>, Edoardo Comar <ecomar@uk.ibm.com>, Mickael Maison <mickael.maison@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
Implementation of KIP-302: Based on the new client configuration `client.dns.lookup`, a NetworkClient can use InetAddress.getAllByName to find all IPs and iterate over them when they fail to connect. Only uses either IPv4 or IPv6 addresses similar to the default mode.
Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
- SslFactoryTest should use SslFactory to create SSLEngine
- Use Mockito instead of EasyMock in `ConsoleConsumerTest` as one of
the tests mocks a standard library class and the latest released EasyMock
version can't do that when Java 11 is used.
- Avoid mocking `ConcurrentMap` in `SourceTaskOffsetCommitterTest`
for similar reasons. As it happens, mocking is not actually needed here.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Development of EasyMock and PowerMock has stagnated while Mockito
continues to be actively developed. With the new Java release cadence,
it's a problem to depend on libraries that do bytecode manipulation
and are not actively maintained. In addition, Mockito is also
easier to use.
While updating the tests, I attempted to go from failing test to
passing test. In cases where the updated test passed on the first
attempt, I artificially broke it to ensure the test was still doing its
job.
I included a few improvements that were helpful while making these
changes:
1. Better exception if there are no nodes in `leastLoadedNodes`
2. Always close the producer in `KafkaProducerTest`
3. requestsInFlight producer metric should not hold a reference to
`Sender`
Finally, `Metadata` is no longer final so that we don't need
`PowerMock` to mock it. It's an internal class, so it's OK.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Dong Lin <lindong28@gmail.com>
Closes#5691 from ijuma/kafka-7438-mockito
OAuthBearerLoginModule is used both on the server-side and client-side (similar to login modules for other mechanisms). OAUTHBEARER tokens are client credentials used only on the client-side to authenticate with servers, but the current implementation requires tokens to be provided on the server-side even if OAUTHBEARER is not used for inter-broker communication. This commit makes tokens optional for server-side login context to allow brokers to be configured without a token when OAUTHBEARER is not used for inter-broker communication.
Reviewers: Ron Dagostino <rndgstn@gmail.com>, Jun Rao <junrao@gmail.com>
This patch contains the broker-side support for the fencing improvements from KIP-320. This includes the leader epoch validation in the ListOffsets, OffsetsForLeaderEpoch, and Fetch APIs as well as the changes needed in the fetcher threads to maintain and use the current leader epoch. The client changes from KIP-320 will be left for a follow-up.
One notable change worth mentioning is that we now require the read lock in `Partition` in order to read from the log or to query offsets. This is necessary to ensure the safety of the leader epoch validation. Additionally, we forward all leader epoch changes to the replica fetcher thread and go through the truncation phase. This is needed to ensure the fetcher always has the latest epoch and to guarantee that we cannot miss needed truncation if we missed an epoch change.
Reviewers: Jun Rao <junrao@gmail.com>
This patch adds checks before reading the first record of a control batch. If the batch is empty, it is treated as having already been cleaned. In the case of LogCleaner this means it is safe to discard. In the case of ProducerStateManager it means it shouldn't cause state to be stored because the relevant transaction has already been cleaned. In the case of Fetcher, it just preempts the check for an abort. In the case of GroupMetadataManager, it doesn't process the offset as a commit. The patch also adds isControl to the output of DumpLogSegments. Changes were tested with unit tests, except the DumpLogSegments change which was tested manually.
`ConsumerCoordinator.onJoinPrepare()` currently makes multiple copies of the set of assigned partitions. We can let `subscriptions.assignedPartitions()` return a view of the underlying partition set, copy it only once and re-use the copied value.
Author: radai-rosenblatt <radai.rosenblatt@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Dong Lin <lindong28@gmail.com>
Closes#5124 from radai-rosenblatt/copy-all-the-things
Lazy allocation helps when there are a large number of connections
that have been accepted, but where no data has been received from
the clients. Each buffer is often around 16k (max TLS record size).
Nulling the buffers should not make a difference in the current
implementation since we release the reference to the channel
and transport layer after we close them, but it's a good practice
to release medium/large buffers after `close` is called.
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>