My top 2 reasons for visiting the Kafka docs are to:
- View configurations
- View metrics
This PR aims to improve the user experience for viewing metrics:
- Add href links to the `Monitoring` section of the Table of Contents so users do not need to scroll or Ctrl/Cmd-F to find specific metric details (Monitoring section has grown large as more component & metrics are added)
Author: lu.kevin@berkeley.edu <kelu@paypal.com>
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#5511 from KevinLiLu/feature/minor-improve-docs
The restart logic for TTLs in `WorkerConfigTransformer` was broken when trying to make it toggle-able. Accessing the toggle through the `Herder` causes the same code to be called recursively. This fix just accesses the toggle by simply looking in the properties map that is passed to `WorkerConfigTransformer`.
Author: Robert Yokota <rayokota@gmail.com>
Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#5914 from rayokota/KAFKA-7620
See https://github.com/spotbugs/spotbugs/issues/756 for details on
the false positives affecting try with resources. An example is:
> RCN | Nullcheck of fc at line 629 of value previously dereferenced in
> org.apache.kafka.common.utils.Utils.readFileAsString(String, Charset)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
* KAFKA-7367: Ensure stateless topologies don't require disk access
* KAFKA-7367: Streams should not create state store directories unless they are needed.
* Addressed the review comments.
* Addressed the review-2 comments.
* Fixed FileAlreadyExistsException
* Addressed the review-3 comments.
* Resolved the conflicts.
This is a new system test testing for optimizing an existing topology. This test takes the following steps
1. Start a Kafka Streams application that uses a selectKey then performs 3 groupByKey() operations and 1 join creating four repartition topics
2. Verify all instances start and process data
3. Stop all instances and verify stopped
4. For each stopped instance update the config for TOPOLOGY_OPTIMIZATION to all then restart the instance and verify the instance has started successfully also verifying Kafka Streams reduced the number of repartition topics from 4 to 1
5. Verify that each instance is processing data from the aggregation, reduce, and join operation
Stop all instances and verify the shut down is complete.
6. For testing I ran two passes of the system test with 25 repeats for a total of 50 test runs.
All test runs passed
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Add the final batch of metrics from KIP-328
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
KAFKA-7597: Add configurable transaction support to ProduceBenchWorker. In order to get support for serializing Optional<> types to JSON, add a new library: jackson-datatype-jdk8. Once Jackson 3 comes out, this library will not be needed.
Reviewers: Colin McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
Changes made as part of this [KIP-374](https://cwiki.apache.org/confluence/x/FgSQBQ) and [KAFKA-7418](https://issues.apache.org/jira/browse/KAFKA-7418)
- Checking for empty args or help option in command file to print Usage
- Added new class to enforce help option to all commands
- Refactored few lines (ex `PreferredReplicaLeaderElectionCommand`) to
make use of `CommandDefaultOptions` attributes.
- Made the changes in help text wordings
Run the unit tests in local(Windows) few Linux friendly tests are failing but
not any functionality, verified `--help` and no option response by running
Scala classes, since those all are having `main` method.
Author: Srinivas Reddy <srinivas96alluri@gmail.com>
Author: Srinivas Reddy <mrsrinivas@users.noreply.github.com>
Author: Srinivas <srinivas96alluri@gmail.com>
Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Manikumar Reddy <manikumar.reddy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
Closes#5910 from mrsrinivas/KIP-374
The Kafka Streams system tests fail with some regularity due to a timeout starting the broker.
The initial start is quite quick, but many of our tests involve stopping and restarting nodes with data already loaded, and also while processing is ongoing.
Under these conditions, it seems to be normal for the broker to take about 25 seconds to start, which makes the 30 second timeout pretty close for comfort.
I have seen many test failures in which the broker successfully started within a couple of seconds after the tests timed out and already initiated the failure/shut-down sequence.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
While metrics like Min, Avg and Max make sense to respective use Double.MAX_VALUE, 0.0 and Double.MIN_VALUE as default values to ease computation logic, exposing those values makes reading them a bit misleading. For instance, how would you differentiate whether your -avg metric has a value of 0 because it was given samples of 0 or no samples were fed to it?
It makes sense to standardize on the output of these metrics with something that clearly denotes that no values have been recorded.
Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
In TopologyTestDriver constructor set non-null topic; and in unit test intentionally turn on caching to verify this case.
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
…eturned by poll() if there are any records to return
The MockConsumer behaves unlike the real consumer in that it can return a non-empty ConsumerRecords from poll, that also has a count of 0. This change makes the MockConsumer only add partitions to the ConsumerRecords if there are records to return for those partitions.
A unit test in MockConsumerTest demonstrates the issue.
Author: Stig Rohde Døssing <stigdoessing@gmail.com>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#5901 from srdo/KAFKA-7616
This pull request removes the final reference to KStreamWindowReducer and replaces it with KStreamWindowAggregate
Signed-off-by: Samuel Hawker sam.b.hawker@gmail.com
contribution is my original work and that I license the work to the project under the project's open source license.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Improve the default group id behavior by:
* changing the default consumer group to null, where no offset commit or fetch, or group management operations are allowed
* deprecating the use of empty (`""`) consumer group on the client
Reviewers: Jason Gustafson <jason@confluent.io>
ReplicaFetcherThread.shutdown attempts to close the fetcher's Selector while the thread is running. This in unsafe and can result in `Selector.close()` failing with an exception. The exception is caught and logged at debug level, but this can lead to socket leak if the shutdown is due to dynamic config update rather than broker shutdown. This PR changes the shutdown logic to close Selector after the replica fetcher thread is shutdown, with a wakeup() and flag to terminate blocking sends first.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
The removed tests have counterparts covered by SuppressScenarioTest using the TopologyTestDriver.
This will speed up the build and improve stability in the CPU-constrained Jenkins environment.
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
The documentation version of 2.1.0 RC1 is still at 2.0. Updated it to 2.1.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Dong Lin <lindong28@gmail.com>
Closes#5916 from vahidhashemian/minor/update_version_in_documentation_for_2.1.0
Callers of 1) Windows#until, 2) Windows#of, 3) Serialized are replaced when possible with the new APIs.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>
Some connector configs may be sensitive, so we should avoid logging them.
Reviewers: Alex Diachenko, Dustin Cote <dustin@confluent.io>, Jason Gustafson <jason@confluent.io>
The function `setup_producer_and_consumer` is unused in the system test
framework, which incorrectly suggests subclasses should implement
it. It is not required or even referenced by the framework, so
the requirement should be removed.
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Jason Gustafson <jason@confluent.io>
Add threads with separate consumers to ConsumeBenchWorker. Update the Trogdor test scripts and documentation with the new functionality.
Reviewers: Colin McCabe <cmccabe@apache.org>
We are seeing some timeouts in tests which depend on the awaitCommitCallback (e.g.
SaslMultiMechanismConsumerTest.testCoordinatorFailover). After some investigation,
we found that it is caused by a disconnect when attempting the async commit.
To fix the problem, we have added simple retry logic to the test utility.
Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Ismael Juma <ismael@juma.me.uk>
- Use Xlint:all with 3 exclusions (filed KAFKA-7613 to remove the exclusions)
- Use the same javac options when compiling tests (seems accidental that
we didn't do this before)
- Replaced several deprecated method calls with non-deprecated ones:
- `KafkaConsumer.poll(long)` and `KafkaConsumer.close(long)`
- `Class.newInstance` and `new Integer/Long` (deprecated since Java 9)
- `scala.Console` (deprecated in Scala 2.11)
- `PartitionData` taking a timestamp (one of them seemingly a bug)
- `JsonMappingException` single parameter constructor
- Fix unnecessary usage of raw types in several places.
- Add @SuppressWarnings for deprecations, unchecked and switch fallthrough in
several places.
- Scala clean-ups (var -> val, ETA expansion warnings, avoid reflective calls)
- Use lambdas to simplify code in a few places
- Add @SafeVarargs, fix varargs usage and remove unnecessary `Utils.mkList` method
Reviewers: Matthias J. Sax <mjsax@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>, Randall Hauch <rhauch@gmail.com>, Bill Bejeck <bill@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
There seems to be no reason to keep this around since it is not used outside
of testing and AbstractIterator is basically the same thing.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Instead of calling deleteSnapshotsAfterRecoveryPointCheckpoint for allLogs, invoking it only for the logs being truncated.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
EasyMock 4.0.x includes a change that relies on the caller for inferring
the return type of mock creator methods. Updated a number of Scala
tests for compilation and execution to succeed.
The versions of EasyMock and PowerMock in this PR include full support
for Java 11.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
* Add logic to retry the BrokerInfo registration into ZooKeeper
In case the ZooKeeper session has been regenerated and the broker
tries to register the BrokerInfo into Zookeeper, this code deletes
the current BrokerInfo from Zookeeper and creates it again, just if
the znode ephemeral owner belongs to the Broker which tries to register
himself again into ZooKeeper
* Add test to validate the BrokerInfo re-registration into ZooKeeper
The problem is the concurrent metadata updates in the foreground and in the heartbeat thread. Changed the code to use ConsumerNetworkClient.poll, which enforces mutual exclusion when accessing the underlying client.
[KAFKA-7431](https://issues.apache.org/jira/browse/KAFKA-7431)
Changes made to improve the code readability:
- Removed `throws Exception` from the place where there won't be an
exception
- Removed type arguments where those can be inferred explicitly by compiler
- Rewritten Anonymous classes to Java 8 with lambdas
Author: Srinivas Reddy <mrsrinivas@users.noreply.github.com>
Author: Srinivas Reddy <srinivas96alluri@gmail.com>
Reviewers: Randall Hauch <rhauch@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Ryanne Dolan <ryannedolan@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#5681 from mrsrinivas/cleanup-connect-uts
Currently PushHttpMetricsReporter will convert value from KafkaMetric.metricValue() to double. This will not work for non-numerical metrics such as version in AppInfoParser whose value can be string. This has caused issue for PushHttpMetricsReporter which in turn caused system test kafkatest.tests.client.quota_test.QuotaTest.test_quota to fail.
Since we allow metric value to be object, PushHttpMetricsReporter should also read metric value as object and pass it to the http server.
Author: Dong Lin <lindong28@gmail.com>
Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#5886 from lindong28/KAFKA-7560
This PR avoids sending out full UpdateMetadataReuqest in the following scenarios:
1. On broker startup, send out full UpdateMetadataRequest to newly added brokers and only send out UpdateMetadataReuqest with empty partition states to existing brokers.
2. On broker failure, if it doesn't require leader election, only include the states of partitions that are hosted by the dead broker(s) in the UpdateMetadataReuqest instead of including all partition states.
This PR also introduces a minor optimization in the MetadataCache update to avoid copying the previous partition states upon receiving UpdateMetadataRequest with no partition states.
Reviewers: Jun Rao <junrao@gmail.com>