Rewrote the InMemoryWindowStore implementation by moving the work of a fetch to the iterator, and cleaned up the iterators as well.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
Refactors the various maps used in TransactionManager into one map to simplify bookkeeping of inflight batches, offsets and sequence numbers.
Reviewers: Jason Gustafson <jason@confluent.io>
Moved hard-coded 'expired-window-record-drop' and 'late-record-drop' to static Strings in StreamsMetricsImpl
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
Stack trace generated from the test failure shows that test failed even though threads were runnable and making progress, indicating that the timeout may be too small when test machine is slow. Increasing timeout from 10 to 15 seconds, consistent with the default wait in other tests. Thread dump also showed a lot of left over threads from other tests, so added clean up of those as well.
Reviewers: Ismael Juma <ismael@juma.me.uk>
SecurityTest.test_client_ssl_endpoint_validation_failure is failing because it greps for 'SSLHandshakeException in the consumer and producer log files. With the fix for KAKFA-7773, the test uses the VerifiableConsumer instead of the ConsoleConsumer, which does not log the exception stack trace to the service log. This patch catches exceptions in the VerifiableConsumer and logs them in order to fix the test. Tested by running the test locally.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
KAFKA-7880: Name worker thread to include task id
Change Connect's `WorkerTask` to name the thread using the `task-thread-<connectorTaskId>` pattern.
Reviewers: Randall Hauch <rhauch@gmail.com>
Port 22 is used by ssh, which causes the AdminClient to throw an OOM:
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
> at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:112)
> at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
> at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
> at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:640)
> at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:561)
> at org.apache.kafka.common.network.Selector.poll(Selector.java:472)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:535)
> at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1140)
> at java.lang.Thread.run(Thread.java:748)
>
>
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#6360 from omkreddy/KAFKA-7312
This patch fixes a bug in log dir reassignment where Partition.maybeReplaceCurrentWithFutureReplica would compare the entire LogEndOffsetMetadata of each replica to determine whether the reassignment has completed. If the active segments of both replicas have different base segments (for example, if the current replica had previously been cleaned and the future replica rolled segments at different points), the reassignment will never complete. The fix is to compare only the LogEndOffsetMetadata.messageOffset for each replica. Tested with a unit test that simulates the compacted current replica case.
Reviewers: Anna Povzner <anna@confluent.io>, Jason Gustafson <jason@confluent.io>
Third (and final) PR in series to inline the generic parameters of the following bytes stores:
[Pt. I] InMemoryKeyValueStore
[Pt. II] RocksDBWindowStore
[Pt. II] RocksDBSessionStore
[Pt. II] MemoryLRUCache
[Pt. II] MemoryNavigableLRUCache
[x] InMemoryWindowStore
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
This patch fixes a regression in the replica fetcher which occurs when the replica fetcher manager simultaneously calls `removeFetcherForPartitions`, removing the corresponding partitionStates, while a replica fetcher thread attempts to truncate the same partition(s) in `truncateToHighWatermark`. This causes an NPE which causes the fetcher to crash.
This change simply checks that the `partitionState` is not null first. Note that a similar guard exists in `truncateToEpochEndOffsets`.
Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Jason Gustafson <jason@confluent.io>
In the RegexSourceIntegrationTest#testRegexMatchesTopicsAWhenCreated() and RegexSourceIntegrationTest#testRegexMatchesTopicsAWhenDeleted() a race condition exists where the ConsumerRebalanceListener in the test modifies the list of subscribed topics when the condition for the test success is comparing the same array instance against expected values.
This PR should fix this race condition by using a CopyOnWriteArrayList which guarantees safe traversal of the list even when a concurrent modification is happening.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Previously the InMemoryKeyValue store would throw a ConcurrentModificationException if the store was modified beneath an open iterator. The TreeMap implementation was swapped with a ConcurrentSkipListMap for similar performance while supporting concurrent access.
Added one test to AbstractKeyValueStoreTest, no existing tests caught this.
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
This patch adds several new log messages to provide more information about errors during log dir movement and to make it clear when each partition movement is finished.
Reviewers: Jason Gustafson <jason@confluent.io>
* KAFKA-7962: StickyAssignor: throws NullPointerException during assignments if topic is deleted
https://issues.apache.org/jira/browse/KAFKA-7962
Consumer using StickyAssignor throws NullPointerException if a subscribed topic was removed.
* addressed vahidhashemian's comments
* lower NPath Complexity
* added a unit test
Second PR in series to inline the generic parameters of the following bytes stores
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
In some test cases it's desirable to instantiate a subclass of `ShutdownableThread` without starting it. Since most subclasses of `ShutdownableThread` put cleanup logic in `ShutdownableThread.shutdown()`, being able to call `shutdown()` on the non-running thread would be useful.
This change allows us to avoid blocking in `ShutdownableThread.shutdown()` if the thread's `run()` method has not been called. We also add a check that `initiateShutdown()` was called before `awaitShutdown()`, to protect against the case where a user calls `awaitShutdown()` before the thread has been started, and unexpectedly is not blocked on the thread shutting down.
Reviewers : Dhruvil Shah <dhruvil@confluent.io>, Jun Rao <junrao@gmail.com>
In order to debug problems with log directory reassignments, it is helpful to know when the fetcher thread begins moving a particular partition. This patch refactors the fetch logic so that we stick to a selected partition as long as it is available and log a message when a different partition is selected.
Reviewers: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Dong Lin <lindong28@gmail.com>, Jun Rao <junrao@gmail.com>
I found this defect while inspecting [KAFKA-7293: Merge followed by groupByKey/join might violate co-partitioning](https://issues.apache.org/jira/browse/KAFKA-7293); This flag is never used now. Instead, `KStreamImpl#repartitionRequired` is now covering its functionality.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
MINOR: Increase produce timeout for EmbeddedKafkaCluster to 120 seconds
Previous value was 500ms. This change gives more room to pass tests on systems with low resources running many parallel tests.
Reviewers: Randall Hauch <randall@confluent.io>
First PR in series to inline the generic parameters of the following bytes stores:
[x] InMemoryKeyValueStore
[ ] RocksDBWindowStore
[ ] RocksDBSessionStore
[ ] MemoryLRUCache
[ ] MemoryNavigableLRUCache
[ ] (awaiting merge) InMemoryWindowStore
A number of tests took advantage of the generic InMemoryKeyValueStore and had to be reworked somewhat -- this PR covers everything related to the in-memory key-value store.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
This fix is aiming for #2 issue pointed out within https://issues.apache.org/jira/browse/KAFKA-7672
In the current setup, we do offset checkpoint file write when EOS is turned on during #suspend, which introduces the potential race condition during StateManager #closeSuspend call. To mitigate the problem, we attempt to always write checkpoint file in #suspend call.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Whenever the consumer coordinator sends a response that doesn't match the client consumer subscription, we should check the subscription to see if it has changed. If it has, we can ignore the assignment and request a rebalance. Otherwise, we can throw an exception as before.
Testing strategy: create a mocked client that first sends an assignment response that doesn't match the client subscription followed by an assignment response that does match the client subscription.
Reviewers: Jason Gustafson <jason@confluent.io>
* In activeTasks.suspend, we should also close all restoring tasks as well. Closing restoring tasks would not require `task.close` as in `closeNonRunningTasks `, since the topology is not initialized yet, instead only state stores are initialized. So we only need to call `task.closeStateManager`.
* Also add @linyli001 's fix.
* Unit tests updated accordingly.
Reviewers: Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>
This is an update to the existing javadocs for KGroupedStream class.
Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>