Author: RichardYuSTUG <yohan.richard.yu2@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#4339 from ConcurrencyPractitioner/kafka-6238
Minor edits on description
1. Replaced KStreamKTableJoinIntegrationTest with the abstract based StreamTableJoinIntegrationTest. Added details on per-step verifications.
2. Minor renaming on GlobalKTableIntegrationTest.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>
Closes#4419 from guozhangwang/KMinor-join-integration-tests-II
This PR is the partial implementation for KIP-149. As the discussion for this KIP is still ongoing, I made a PR on the "safe" portions of the KIP (so that it can be included in the next release) which are 1) `ValueMapperWithKey`, 2) `ValueTransformerWithKeySupplier`, and 3) `ValueTransformerWithKey`.
Author: Jeyhun Karimov <je.karimov@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#4309 from jeyhunkarimov/KIP-149_hope_last
- Add capability to create delegation token
- Add authentication based on delegation token.
- Add capability to renew/expire delegation tokens.
- Add units tests and integration tests
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#3616 from omkreddy/KAFKA-4541
This is the implementation of KIP-225.
It marks the previous metrics as deprecated in the documentation and adds new metrics using tags.
Testing verifies that both the new and the old metric report the same value.
Author: cmolter <cmolter@apple.com>
Reviewers: Jiangjie (Becket) Qin <becket.qin@gmail.com>
Closes#4362 from lahabana/kafka-5890
0. Rename `JoinIntegrationTest` to `StreamStreamJoinIntegrationTest`, which is only for KStream-KStream joins.
1. Extract the `AbstractJoinIntegrationTest` which is going to be used for all the join integration test classes, parameterized with and without caching.
2. Merge `KStreamRepartitionJoinTest.java` into `StreamStreamJoinIntegrationTest.java` with augmented stream-stream join.
3. Add `TableTableJoinIntegrationTest` with detailed per-step expected results and removed `KTableKTableJoinIntegrationTest`.
Findings of the integration test:
1. Confirmed KAFKA-4309 with caching turned on.
2. Found bug KAFKA-6398.
3. Found bug KAFKA-6443.
4. Found a bug that in CachingKeyValueStore, we would flush before putting the record into the underlying store, when the store is going to be used in the downstream processors with flushing it would result in incorrect results, fixed the issue along with this PR.
5. Consider a new optimization described in KAFKA-6286.
Future works including stream-table joins will be in other PRs.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>
Closes#4331 from guozhangwang/KMinor-join-integration-tests
When using Kafka Connect with a cluster that doesn't allow the user to
create topics (due to ACL configuration), Connect fails when trying to
create its internal topics even if these topics already exist. This is
incorrect behavior according to the documentation, which mentions that
R/W access should be enough.
This happens specifically when using Aiven Kafka, which does not permit
creation of topics via the Kafka Admin Client API.
The patch ignores the returned error, similar to the behavior for older
brokers that don't support the API.
A spinoff of original pull request #4340 for resolving conflicts.
Author: RichardYuSTUG <yohan.richard.yu2@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#4413 from ConcurrencyPractitioner/kafka-6265-2
*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*
*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*
Author: Rohan Desai <desai.p.rohan@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#4382 from rodesai/KAFKA-6383
1. Include the parent's queryable store name in KTable.filter if this operator is not materialized.
2. Augment InternalTopologyBuilder checking on null processor / store names from the enum.
3. Unit test.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Closes#4384 from guozhangwang/K6398-topology-builder-exception
Currently CachingKeyValueStore methods are synchronized at method level.
It seems we can use read lock for getter and write lock for put / delete methods.
For getInternal(), if the underlying thread is streamThread, the getInternal() may trigger eviction. This can be handled by obtaining write lock at the beginning of the method for streamThread.
The jmh patch is attached to JIRA:
https://issues.apache.org/jira/secure/attachment/12905140/6412-jmh.v1.txt
Author: tedyu <yuzhihong@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>
Closes#4372 from tedyu/6412
Different fix for problem addressed by https://github.com/apache/kafka/pull/1953. Should prevent the CLASSPATH environment variable from being prefixed by a single colon before the JVM is invoked in the run-class script, which will then prevent the current working directory from being unintentionally included in the classpath when using the Reflections library.
If the current working directory should still be included in the classpath, it just needs to be explicitly specified either with its fully-qualified pathname or as a single dot (".").
Author: Chris Egerton <chrise@confluent.io>
Reviewers: Randall Hauch <rhauch@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#4406 from C0urante/fix-colon-prefixed-classpath
* Implement MockAdminClient.deleteTopics
* Use MockAdminClient instead of MockKafkaAdminClientEnv in StreamsResetterTest
* Rename MockKafkaAdminClientEnv to AdminClientUnitTestEnv
* Use MockAdminClient instead of MockKafkaAdminClientEnv in TopicAdminTest
* Rename KafkaAdminClient to AdminClientUnitTestEnv in KafkaAdminClientTest.java
* Migrate StreamThreadTest to MockAdminClient
* Fix style errors
* Address review comments
* Fix MockAdminClient call
Reviewers: Matthias J. Sax <matthias@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Handle InvalidTypeIdException as NOT_IMPLEMENTED and add unit tests for all exceptions.
Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
We are closing the metricGroups created in a Worker, Source task and Sink task before populating them with new metrics. This helps in cases where an Exception is thrown when previously created groups were not cleaned up correctly.
Signed-off-by: Arjun Satish <arjunconfluent.io>
Author: Arjun Satish <arjun@confluent.io>
Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#4397 from wicknicks/KAFKA-6252
When the source file of `FileStreamSource` is a large file, `FileStreamSourceTask.poll()` will result in OOM. This pull request added `batch.size` parameter which can restrict the poll size.
*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*
*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*
Author: Study <ph.study@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#4356 from phstudy/KAFKA-4335
- Add missing locking/volatile in MetadataCache.aliveEndPoint
- Fix topic metadata not to throw BrokerNotAvailableException
when listeners are inconsistent. Add test verifying the fix. As
part of this fix, renamed Broker methods to follow Map
convention where the `get` version returns `Option`.
Reviewers: Jason Gustafson <jason@confluent.io>
LogContext to have two different implementations of Logger. One will be picked based on availability of LocationAwareLogger API.
Reviewers: Jason Gustafson <jason@confluent.io>
- Fix zk session state and session change rate metric names: type
should be SessionExpireListener instead of KafkaHealthCheck. Test
verifying the fix was included.
- Handle missing controller in controlled shutdown in the same way as if
the broker is not registered (i.e. retry after backoff).
- Restructure BrokerInfo to reduce duplication. It now contains a
Broker instance and the JSON serde is done in BrokerIdZNode
since `Broker` does not contain all the fields.
- Remove dead code from `ZooKeeperClient.initialize` and remove
redundant `close` calls.
- Move ACL handling and persistent paths definition from ZkUtils to
ZkData (and call ZkData from ZkUtils).
- Remove ZooKeeperClientWrapper and ZooKeeperClientMetrics from
ZkUtils (avoids metrics clash if third party users create a ZkUtils
instance in the same process as the broker).
- Introduce factory method in KafkaZkClient that creates
ZooKeeperClient and remove metric name defaults from
ZooKeeperClient.
- Fix a few instances where ZooKeeperClient was not closed in tests.
- Update a few TestUtils methods to use KafkaZkClient instead of
ZkUtils.
- Add test verifying SessionState metric.
- Various clean-ups.
Testing: mostly relying on existing tests, but added a couple
of new tests as mentioned above.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#4359 from ijuma/kafka-6320-kafka-health-zk-metrics-follow-up
When a log entry is appended to a Kafka topic using `KafkaLog4jAppender`, the producer.send operation may block waiting for metadata. This can result in deadlocks in a couple of scenarios if a log entry from the producer network thread is also at a log level that results in the entry being appended to a Kafka topic.
1. Producer's network thread will attempt to send data to a Kafka topic and this is unsafe since producer.send may block waiting for metadata, causing a deadlock since the thread will not process the metadata request/response.
2. `KafkaLog4jAppender#append` is invoked while holding the lock of the logger. So the thread waiting for metadata in the initial send will be holding the logger lock. If the producer network thread has.a log entry that needs to be appended, it will attempt to acquire the logger lock and deadlock.
This is a temporary workaround to avoid deadlocks in system tests by setting log level to WARN for `Metadata` in `VerifiableLog4jAppender`. The fix has been verified using the system tests log4j_appender_test.py which started failing when the info-level log entry was introduced.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Satish Duggana <satish.duggana@gmail.com>, tedyu <yuzhihong@gmail.com>
Closes#4375 from rajinisivaram/KAFKA-6415-log4jappender
Increase commit interval to make it less likely that we flush the cache in-between.
To make it fool-proof, only compare the "final" result records if cache is enabled.
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#4364 from mjsax/kafka-6256-flaky-kstream-ktable-join-with-caching-test
* Removed code duplicate from GlobalProcessorContextImpl and ProcessorContextImpl to parent class AbstractProcessorContext
* Exchanged concrete implementations with interfaces to make code more maintainable
* Refactored major code duplicates in InternalTopologyBuilder
* Formatted function parameters as per code review
Added final to code introduced in this PR
* Added missing finals to putNodeGroupName function
Rearranged parameters for resetTopicsPattern function
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
* ensure topics are created with correct partitions BEFORE building the metadata for our stream tasks
* Added a test case. The test should fail with the old logic, because:
While stream-partition-assignor-test-KSTREAM-MAP-0000000001-repartition is created correctly with four partitions, the StreamPartitionAssignor will only assign three tasks to the topic. Test passes with the new logic.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>, Ted Yu <yuzhihong@gmail.com>
* Return offset of next record of records left after restore completed
* Changed check for restoring partition to remove the "+1" in the guard condition
Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
* KAFKA-6383: complete shutdown for CREATED StreamThreads
When transitioning StreamThread from CREATED to PENDING_SHUTDOWN
free up resources from the caller, rather than the stream thread,
since in this case the stream thread was never actually started.
Have StreamThread.setState return the old state. If the old state is
CREATED in StreamThread.shutdown() then start the thread so that it
can free the resources owned by the StreamThread.
Add a KafkaStreams test to verify that the producer gets closed even
if KafkaStreams was not started
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
We should remove the map entry from mbeans if it becomes
empty during metric removal.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Satish Duggana <satish.duggana@gmail.com>, Ismael Juma <ismael@juma.me.uk>