Technically this does not strictly adhere to RFC-952 however it is valid for domain names, urls and uris so we should loosen the requirements a tad.
Author: Ryan Pridgeon <ryan.n.pridgeon@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1856 from rnpridgeon/KAFKA-3719
Author: Jiangjie Qin <becket.qin@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes#1852 from becketqin/KAFKA-4148
When the consumer is not subscribed to any topic or, in the case of manual assignment, is not assigned any partition, calling `poll()` should raise an exception.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#1839 from vahidhashemian/KAFKA-4135
Implementation and tests for secure quotas at <user> and <user, client-id> levels as described in KIP-55. Also adds dynamic default quotas for <client-id>, <user> and <user-client-id>. For each client connection, the most specific quota matching the connection is used, with user quota taking precedence over client-id quota.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#1753 from rajinisivaram/KAFKA-3492
This PR implements KIP-78:Cluster Identifiers [(link)](https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id#KIP-78:ClusterId-Overview) and includes the following changes:
1. Changes to broker code
- generate cluster id and store it in Zookeeper
- update protocol to add cluster id to metadata request and response
- add ClusterResourceListener interface, ClusterResource class and ClusterMetadataListeners utility class
- send ClusterResource events to the metric reporters
2. Changes to client code
- update Cluster and Metadata code to support cluster id
- update clients for sending ClusterResource events to interceptors, (de)serializers and metric reporters
3. Integration tests for interceptors, (de)serializers and metric reporters for clients and for protocol changes and metric reporters for broker.
4. System tests for upgrading from previous versions.
Author: Sumit Arrawatia <sumit.arrawatia@gmail.com>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1830 from arrawatia/kip-78
This applies to Replication Quotas
based on KIP-73 [(link)](https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas) originally motivated by KAFKA-1464.
System Tests Run: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/544/
**This first PR demonstrates the approach**.
**_Overview of Change_**
The guts of this change are relatively small. Throttling occurs on both leader and follower sides. A single class tracks the throttled throughput in and out of each broker (**_ReplicationQuotaManager_**).
On the follower side, the Follower Throttled Rate is calculated as fetch responses arrive. Then, before the next fetch request is sent, we check to see if the quota is violated, removing throttled partitions from the request if it is. This is all encapsulated in a few lines of code in the **_ReplicaFetcherThread_**. There is existing code to handle temporal back off, if the request ends up being empty.
On the leader side it's a little more complex. When a fetch request arrives in the leader, it is built, partition by partition, in **_ReplicaManager.readFromLocalLog_**. As we put each partition into the fetch response, we check if the total size fits in the current quota. If the quota is exceeded, the partition will not be added to the fetch response. Importantly, we don't increase the quota at this point, we just check to see if the bytes will fit.
Now, if there aren't enough bytes to send the response immediately, which is common if we're catching up and throttled, then the request will be put in purgatory. I've added some simple code to **_DelayedFetch_** to handle throttled partitions (throttled partitions are checked against the quota, rather than the messages available in the log).
When the delayed fetch completes, and exits purgatory, _**ReplicaManager.readFromLocalLog**_ will be called again. This is why _**ReplicaManager.readFromLocalLog**_ does not actually increase the quota, it just checks whether enough bytes are available for a partition.
Finally, when there are enough bytes to be sent, or the delayed fetch times out, the response will be sent. Before it is sent the throttled-outbound-rate is increased, based on the size of throttled partitions being sent. This is at the end of _**KafkaApis.handleFetchRequest**_, exactly where client quotas are recorded.
There is an acceptance test which asserts the whole throttling process stabilises on the desired value. This covers a number of use cases including many-to-many replication. See **_ReplicationQuotaTest_**.
Note:
It should be noted that this protocol can over-request. The request is built, based on the quota at time t1 (_ReplicaManager.readFromLocalLog_). The bytes in the response are recorded at time t2 (end of _KafkaApis.handleFetchRequest_), where t2 > t1. For this reason I originally included an OverRequestedRate as a JMX metric, but testing has not seen revealed any obvious issue. Over-requesting is quickly compensated by subsequent requests, stabilising close to the quota value.
_**Main stuff left to do:**_
- The fetch size is currently unbounded. This will be addressed in KIP-74, but we need to ensure this ensures requests don’t go beyond the throttle window.
- There are two failures showing up in the system tests on this branch: StreamsSmokeTest.test_streams (which looks like it fails regularly) and OffsetValidationTest.test_broker_rolling_bounce (which I need to look into)
_**Stuff left to do that could be deferred:**_
- Add the extra metrics specified in the KIP.
- There are no system tests.
- There is no validation for the cluster size / throttle combination that could lead to ISR dropouts
Author: Ben Stopford <benstopford@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Apurva Mehta <apurva@confluent.io>, Jun Rao <junrao@gmail.com>
Closes#1776 from benstopford/rep-quotas-v2
Followed the same naming pattern as the producer sender thread.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson
Closes#1854 from ijuma/heartbeat-thread-name
Here is the patch on github ijuma.
Acquiring the consumer lock (the single thread access controls) requires that the consumer be open. I changed the closed variable to be volatile so that another thread's writes will visible to the reading thread.
Additionally, there was an additional check if the consumer was closed after the lock was acquired. This check is no longer necessary.
This is my original work and I license it to the project under the project's open source license.
Author: Tim Brooks <tim@uncontended.net>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes#1637 from tbrooks8/KAFKA-2311
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Dan Norwood <norwood@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#1821 from hachikuji/KAFKA-3807
This PR changes topic subscription semantics so a change in subscription does not immediately cause a rebalance.
Instead, the next poll or the next scheduled metadata refresh will update the assigned partitions.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Jason Gustafson
Closes#1726 from vahidhashemian/KAFKA-4033
change console producer default acks to 1, update acks docs. Also added the -1 config to the acks docs since that question comes up often. ijuma and vahidhashemian, does this look reasonable to you?
Author: Dustin Cote <dustin@confluent.io>
Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1795 from cotedm/KAFKA-3129
- use AdminTool to check for active consumer group
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1767 from mjsax/kafka-4058-trunk
The commit here changes the log level of a log message from WARN to DEBUG.
As noted in the mail discussion here https://www.mail-archive.com/devkafka.apache.org/msg56035.html,
in a pretty straightforward/typical and valid setup, the broker logs get
flooded with the following message:
[2016-09-02 08:07:13,773] WARN SSL peer is not authenticated, returning ANONYMOUS instead (org.apache.kafka.common.network.SslTransportLayer)
Author: Jaikiran Pai <jaikiran.pai@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1825 from jaikiran/ssl-log-level
Also add tests and make `Crc32.update` perform the same argument checks as
`java.util.zip.CRC32`.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes#1672 from ijuma/record-is-valid-should-be-more-robust
Modifies example in doc change
Author: ybyzek <ybyzek@users.noreply.github.com>
Reviewers: Guozhang Wang, Ismael Juma
Closes#1805 from ybyzek/onComplete_doc
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes#1793 from ijuma/include-request-header-if-request-correlation-fails
Change cleanup.policy to accept a comma separated list of valid policies.
Updated LogCleaner.CleanerThread to also run deletion for any topics configured with compact,delete.
Ensure Log.deleteSegments only runs when delete is enabled.
Additional Integration and unit tests to cover new option
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes#1742 from dguy/kafka-4015
ijuma - Making the change against trunk based on your suggestions to have the stream closing handled in the private RecordIterator constructor which I understand is only to be used only if the block of message(s) are compressed.
Author: William Yu <wyu@unified.com>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1760 from wiyu/compressor_memory_leak_in_fetcher
Configure default classes using class objects instead of class names, enable configurable lists of classes to be specified as class objects, add tests for different classloader configurations.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1421 from rajinisivaram/KAFKA-3680
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <me@ewencp.org>
Closes#1746 from guozhangwang/K4049-RegexSourceIntegrationTest-failure
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#1763 from rajinisivaram/KAFKA-4066
In case of unknown configs, only list the name without the value
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Jaikiran, Gwen Shapira, Grant Henke, Ryan Pridgeon, Dustin Cote
Closes#1759 from mimaison/KAFKA-4056
Add an optional configuration for the SecureRandom PRNG implementation, with the default behavior being the same (use the default implementation in the JDK/JRE).
Author: Todd Palino <Todd Palino>
Reviewers: Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Joel Koshy <jjkoshy@gmail.com>, Jiangjie Qin <becket.qin@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
Closes#1747 from toddpalino/trunk
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Dana Powers, Gwen Shapira
Closes#1732 from vahidhashemian/minor/clarify_producer_config_documentation
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes#1627 from hachikuji/KAFKA-3888
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Michael G. Noll <michael@confluent.io>, Greg Fodor <gfodor@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1530 from guozhangwang/K3769-per-thread-metrics
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1720 from hachikuji/KAFKA-4034
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <me@ewencp.org>, Jun Rao <junrao@gmail.com>
Closes#1616 from granthenke/delete-wire-new
guozhangwang enothereska mjsax miguno please take a look. A few things that need to be clarified
1. I've added StreamsConfig.USER_ENDPOINT_CONFIG, but should we have separate configs for host and port or is this one config ok?
2. `HostState` in the KIP has a byte[] field - not sure why and what it would be populated with
3. I've changed the API to return `List<KafkaStreamsInstance>` as opposed to `Map<HostInfo, Set<TaskMetadata>>` as i find this far more intuitive to work with.
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Matthias J. Sax, Michael G. Noll, Eno Thereska, Guozhang Wang
Closes#1576 from dguy/kafka-3914v2
It previously hardcoded it.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke <granthenke@gmail.com>, Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1613 from ijuma/kafka-3954-consumer-internal-topics-from-broker
Also include a few minor clean-ups.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Manikumar reddy O <manikumar.reddy@gmail.com>, Grant Henke <granthenke@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1623 from ijuma/fix-zk-inconsistent-security-check
ijuma said that it would make sense to split out this work from KAFKA-3234, since KAFKA-3234 had both a mechanical change (generating docs) as well as a change requiring discussion (deprecating/renaming config options).
jjkoshy, I hope you don't mind that I took over this work. It's been 3 months since the last activity on KAFKA-3234, so I thought it would be okay to take over.
This work is essentially is the first 5-6 commits from Joel's https://github.com/apache/kafka/pull/907. However, since I'm not very experienced with git, I didn't do a direct merge/rebase, but instead largely hand-merged it. I did some minor cleanup. All credit goes to Joel, all blame goes to me. :)
For reference, I attached the auto-generated configuration.html file (as a PDF, because github won't let me attache html).
[configuration.pdf](https://github.com/apache/kafka/files/323901/configuration.pdf)
This is my first time writing Scala, so let me know if there are any changes needed.
I don't know who is the right person to review this. ijuma, can you help me redirect this to the appropriate person? Thanks.
Author: James Cheng <jylcheng@yahoo.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Joel Koshy <jjkoshy@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1527 from wushujames/generate_topic_docs
1. The IllegalStateException is actually thrown from testCloseWithZeroTimeoutFromSenderThread() due to a bug. We call producer.close() in the callback. Once the first callback is called, producing records in the callback will hit the IllegalStateException. This only pollutes the output, but doesn't fail the test. I fixed this by only calling producer.send() in the first callback.
2. It's not clear which test throws TimeoutException and it's not reproducible locally. One thing is that the error message in TimeoutException is mis-leading since the timeout is not necessarily due to metadata. Improved this by making the error message in TimeoutException clearer.
3. It's not clear what actually failed testSendNonCompressedMessageWithCreateTime(). One thing I found is that since we set the linger time to MAX_LONG and are sending small messages, those produced messages won't be drained until we call producer.close(10000L, TimeUnit.MILLISECONDS). Normally, 10 secs should be enough for the records to be sent. My only hypothesis is that since SSL is more expensive, occasionally, 10 secs is still not enough. So, I bumped up the timeout from 10 secs to 20 secs.
Author: Jun Rao <junrao@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1703 from junrao/kafka-3875
- Removed an unnecessary annotation
- Parameterized a couple of raw types
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1675 from mimaison/warnings
…nly be 0, 1, or -1
Rephrased the documentation string for the Produce request
Updated the acks configuration docs to state that -1, 0, and 1 are the only allowed values
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Gwen Shapira
Closes#1680 from mimaison/KAFKA-3946
InvalidRecordException is not part of the public API so go back
to the behaviour before ff557f02 for now.
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1676 from hachikuji/raise-kafka-exception-on-invalid-crc