Add UUID to the list of types documented in Type#toHtml.
Type, Protocol, ArrayOf: use Type#isArray and Type#arrayElementType rather than typecasting to handle arrays. This is cleaner. It will also make it easier for us to add compact arrays (as specified by KIP-482) as a new array type distinct from the old array type.
Add MessageUtil#byteBufferToArray, as well as a test for it. We will need this for handling tagged fields of type "bytes".
Schema#Visitor: we don't need a separate function overload for visiting arrays. We can just call "visit(Type field)".
TestUUID.json: reformat the JSON file to match the others.
ProtocolSerializationTest: improve the error messages on failure. Check that each type has the name we expect it to have.
Reviewers: David Arthur <mumrah@gmail.com>, José Armando García Sancio <jsancio@gmail.com>, Vikas Singh <soondenana@users.noreply.github.com>
Some work needs to be done in Streams before we can incorporate cooperative rebalancing.
This PR lays the groundwork for it by doing some refactoring, including a behavioral change that affects eager ("normal") rebalancing as well: will no longer suspend standbys in onPartitionsRevoked, instead we just close any that were reassigned in onPartitionsAssigned
Reviewers: Bruno Cadonna <bruno@confluent.io>, Boyang Chen <boyang@confluent.io>, John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
A metric recorder runs in it own thread and regularly records RocksDB metrics from
RocksDB's statistics. For segmented state stores the metrics are aggregated over the
segments.
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
1. Add the overloaded functions.
2. Update the code in Streams to use the batch API for better latency (this applies to both active StreamsTask for initialize the offsets, as well as the StandbyTasks for updating offset limits).
3. Also update all unit test to replace the deprecated APIs.
Reviewers: Christopher Pettitt <cpettitt@confluent.io>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Bill Bejeck <bill@confluent.io>
This is the last PR for the KIP-307.
NOTE : PR 6412 should be merge first
Thanks a lot for the review.
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
I realized some flaky tests failed at setup or calls that tries to create offset topics, and I think using one partition and one replica would be sufficient in these cases.
Reviewers: Bill Bejeck <bill@confluent.io>
Cache-level metrics are refactor according to KIP-444:
tag client-id changed to thread-id
name hitRatio changed to hit-ratio
made backward compatible by using streams config built.in.metrics.version
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
Part of supporting KIP-213 ( https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable ). Murmur3 hash is used as a hashing mechanism in KIP-213 for the large range of uniqueness. The Murmur3 class and tests are ported directly from Apache Hive, with no alterations to the code or dependencies.
Author: Adam Bellemare <adam.bellemare@wishabi.com>
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes#7271 from bellemare/murmur3hash
KIP-455 (18d4e57f6e8c67ffa7937fc855707d3a03cc165a) bumped the LeaderAndIsr version to 3 but did not change the Controller code to actually send the new version. The ControllerChannelManagerTest had a bug which made it assert wrongly, hence why it did not catch it. This patch fixes said test.
Because the new fields in LeaderAndIsr are not used yet, the gap was not caught by integration tests either.
Reviewers: Jason Gustafson <jason@confluent.io>
Reviewers: Bruno Cadonna <bruno@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, Matthias J. Sax <matthias@confluent.io>
It's useful to know when the cleaner runs what the last modified time
of the segment and the deletion horizon is. The current log message
only allows you to infer that one is greater than the other.
Reviewers: Jun Rao <junrao@gmail.com>
This PR makes two changes to code in the ReplicaManager.updateFollowerFetchState path, which is in the hot path for follower fetches. Although calling ReplicaManager.updateFollowerFetch state is inexpensive on its own, it is called once for each partition every time a follower fetch occurs.
1. updateFollowerFetchState no longer calls maybeExpandIsr when the follower is already in the ISR. This avoid repeated expansion checks.
2. Partition.maybeIncrementLeaderHW is also in the hot path for ReplicaManager.updateFollowerFetchState. Partition.maybeIncrementLeaderHW calls Partition.remoteReplicas four times each iteration, and it performs a toSet conversion. maybeIncrementLeaderHW now avoids generating any intermediate collections when updating the HWM.
**Benchmark results for Partition.updateFollowerFetchState on a r5.xlarge:**
Old:
```
1288.633 ±(99.9%) 1.170 ns/op [Average]
(min, avg, max) = (1287.343, 1288.633, 1290.398), stdev = 1.037
CI (99.9%): [1287.463, 1289.802] (assumes normal distribution)
```
New (when follower fetch offset is updated):
```
261.727 ±(99.9%) 0.122 ns/op [Average]
(min, avg, max) = (261.565, 261.727, 261.937), stdev = 0.114
CI (99.9%): [261.605, 261.848] (assumes normal distribution)
```
New (when follower fetch offset is the same):
```
68.484 ±(99.9%) 0.025 ns/op [Average]
(min, avg, max) = (68.446, 68.484, 68.520), stdev = 0.023
CI (99.9%): [68.460, 68.509] (assumes normal distribution)
```
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
* log lock acquistion failures on the state store
* Document required uniqueness of state.dir path
* Move bunch of log calls around task state changes to DEBUG
* More readable log messages during partition assignment
Reviewers: Matthias J. Sax <mjsax@apache.org>, A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Given that the tests do not create clusters larger than 3, we do not gain much by creating 50 partitions for that topic. Reducing it should slightly increase test startup and shutdown speed.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bbejeck@gmail.com>
The streams config built.in.metrics.version is needed to add metrics in
a backward-compatible way. However, not in every location where metrics are
added a streams config is available to check built.in.metrics.version. Thus,
the config value needs to be exposed through the StreamsMetricsImpl object.
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
This adds an administrative API to delete consumer offsets of a group as well as extends the mechanism to expire offsets of consumer groups.
It makes the group coordinator aware of the set of topics a consumer group (protocol type == 'consumer') is actively subscribed to, allowing offsets of topics which are not actively subscribed to by the group to be either expired or administratively deleted. The expiration rules remain the same.
For the other groups (non-consumer), the API allows to delete offsets when the group is empty and the expiration remains the same.
Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Jason Gustafson <jason@confluent.io>
Replace the `<table>` elements by `<ul>` so the full page width can be used for the configuration descriptions instead of only a very narrow column. I moved the other fields (Type, Default Value, etc) below each entry.
Reviewers: Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>
Key changes include:
1. Moves general offset limit updates down to StandbyTask.
2. Updates offsets for StandbyTask at most once per commit and only when we need and updated offset limit to make progress.
3. Avoids writing an 0 checkpoint when StandbyTask.update is called but we cannot apply any of the records.
4. Avoids going into a restoring state in the case that the last checkpoint is greater or equal to the offset limit (consumer committed offset). This needs special attention please. Code is in
StoreChangelogReader.
5. Does update offset limits initially for StreamTask because it provides a way to prevent playing to many records from the changelog (also the input topic with optimized topology).
NOTE: this PR depends on KAFKA-8816, which is under review separately. Fortunately the changes involved are few. You can focus just on the KAFKA-8755 commit if you prefer.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
The purpose of this PR is to add static membership support for range assignor. More details for the motivation in here.
Similar to round robin assignor, if we are capable of persisting member identity across generations, we will reach a much more stable assignment.
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>, Bruno Cadonna <bruno@confluent.io>
If the topic already exists, `handleCreateTopicsRequest` should return TopicExistsException even given an invalid config (replication factor for instance).
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Jason Gustafson <jason@confluent.io>
Implement the revisions to the controller state machine and reassignment logic needed for KIP-455.
Add the addingReplicas and removingReplicas field to the topics ZNode.
Deprecate the methods initiating a reassignment via direct ZK access in KafkaZkClient.
Add ControllerContextTest, and add some test cases to ReassignPartitionsClusterTest.
Add a note to upgrade.html recommending not initiating reassignments during an upgrade.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Viktor Somogyi <viktorsomogyi@gmail.com>
When we have an unhandled exception in the request handler, we print some details about the request such as the api key and payload. It is also useful to see the version of the request which is not always apparent from the request payload.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
I bumped into this flaky test while working on another PR. It's a bit different from the reported PR, where it actually timed out at parsing localhost:port already. I think what we should really test is that the closing call can complete, so I removed the whole test timeout and add the timeout for the shutdown latch instead.
Reviewers: Jason Gustafson <jason@confluent.io>, cpettitt-confluent <53191309+cpettitt-confluent@users.noreply.github.com>
This patch adds an atomic counter in the test to ensure we have processed all the events before we assert the metrics. There was a race condition with the previous assertion, which asserted that the event queue is empty before checking the metrics.
Reviewers: Jason Gustafson <jason@confluent.io>
The previous approach to testing KAFKA-8412 was to look at the logs and
determine if an error occurred during close. There was no direct way to
detect than an exception occurred because the exception was eaten in
AssignedTasks.close. In the PR for that ticket (#7207) it was
acknowledged that this was a brittle way to test for the exception. We
now see occasional failures because an unrelated ERROR level log entry
is made while closing the task.
This change eliminates the brittle log checking by rethrowing any time
an exception occurs in close, even when a subsequent unclean close
succeeds. This has the potential benefit of uncovering other supressed
exceptions down the road.
I've verified that even with us rethrowing on closeUnclean that all
tests pass.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
We need stacktrace of the error to understand the root cause and to trouble shoot the underlying problem.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
This PR adds supporting features for static membership. It could batch remove consumers from the group with provided group.instance.id list.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
This change adds a command line option to the `ducker-ak up' command to enable exposing ports from docker containers. The exposed ports will be mapped to the ephemeral ports on the host. The option is called `expose-ports' and can take either a single value (like 5005) or a range (like 5005-5009). This port will then exposed from each docker container that ducker-ak sets up.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@users.noreply.github.com>
This creates a test that generates sustained connections against Kafka. There
are three different components we can stress with this, KafkaConsumer,
KafkaProducer, and AdminClient. This test tries use minimal bandwidth per
connection to reduce overhead impacts.
This test works by creating a threadpool that creates connections and then
maintains a central pool of connections at a specified keepalive rate. The
keepalive action varies by which component is being stressed:
* KafkaProducer: Sends one single produce record. The configuration for
the produce request uses the same key/value generator as the ProduceBench
test.
* KafkaConsumer: Subscribes to a single partition, seeks to the end, and
then polls a minimal number of records. Each consumer connection is its
own consumer group, and defaults to 1024 bytes as FETCH_MAX_BYTES to keep
traffic to a minimum.
* AdminClient: Makes an API call to get the nodes in the cluster.
NOTE: This test is designed to be run alongside a ProduceBench test for a
specific topic, due to the way the Consumer test polls a single partition.
There may be no data returned by the consumer test if this is run on its own.
The connection should still be kept alive, but with no data returned.
Author: Scott Hendricks <scott.hendricks@confluent.io>
Reviewers: Stanislav Kozlovski, Gwen Shapira
Closes#7289 from scott-hendricks/trunk