1. At the beginning of assign, we first check that all the non-repartition source topics are included in the metadata. If not, we log an error at the leader and set an error in the Assignment userData bytes, indicating that leader cannot complete assignment and the error code would indicate the root cause of it.
2. Upon receiving the assignment, if the error is not NONE the streams will shutdown itself with a log entry re-stating the root cause interpreted from the error code.
Author: tedyu <yuzhihong@gmail.com>
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
Closes#5322 from tedyu/trunk
The SASL/OAUTHBEARER client response as currently implemented in OAuthBearerSaslClient sends the valid gs2-header "n,," but then sends the "auth" key and value immediately after it.
This does not conform to the specification because there is no %x01 after the gs2-header, no %x01 after the auth value, and no terminating %x01. Fixed this and the parsing of the client response in
OAuthBearerSaslServer, which currently allows the malformed text. Also updated to accept and ignore unknown properties as required by the spec.
Reviewers: Stanislav Kozlovski <familyguyuser192@windowslive.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
1. Remove MinTimestampTracker and its TimestampTracker interface.
2. In RecordQueue, keep track of the head record (deserialized) while put the rest raw bytes records in the fifo queue, the head record as well as the partition timestamp will be updated accordingly.
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
- Replace adminZkClient.createOrUpdateTopicPartitionAssignmentPathInZK calls with TestUtils.createTopic wherever applicable
- Replace adminZkClient.createTopic calls with TestUtils.createTopic wherever applicable
- Move non-deprecated tests to other test classes and deprecate AdminTest.scala
- Remove duplicate tests between AdminTest and AdminZkClientTest
Author: Manikumar Reddy <manikumar.reddy@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Dong Lin <lindong28@gmail.com>
Closes#5303 from omkreddy/topiccreate
SSL `close_notify` from broker connection close was processed as a handshake failure in clients while unwrapping the message if a handshake is in progress. Updated to handle this as a retriable IOException rather than a non-retriable SslAuthenticationException to avoid authentication exceptions in clients during rolling restart of brokers.
Reviewers: Ismael Juma <ismael@juma.me.uk>
The default server.properties file now contains the log.dirs setting and not log.dir anymore.
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: Katherine Farmer <kfarme3@uk.ibm.com>
Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Sriharsha Chintalapani <sriharsha@apache.org>
…egal char and generates InvalidTopicException
If config parameter max.block.ms config parameter is set to a non-zero value,
KafkaProducer.send() blocks for the max.block.ms time if topic name has illegal
char or is invalid.
Wrote a unit test that verifies the appropriate exception is returned when
performing a get on the returned future by KafkaProducer.send().
Author: Ahmed Al Mehdi <aalmehdi@aalmehdi-ld1.linkedin.biz>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Joel Koshy <jjkoshy@gmail.com>, Manikumar Reddy O <manikumar.reddy@gmail.com>
Closes#5247 from ahmedha/KAFKA-5098
This PR uses bulk loading for recovering RocksDBWindowStore, same as RocksDBStore.
Reviewers: Boyang Chen <bchen11@outlook.com>, Shawn Nguyen <shnguyen@pinterest.com>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
This includes a fix for ZOOKEEPER-2184 (Zookeeper Client
should re-resolve hosts when connection attempts fail), which
fixes KAFKA-4041.
Updated a couple of tests as unresolvable addresses are now
retried until the connection timeout. Cleaned up tests a little.
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
This setting allows specifying a chroot so we documented it.
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: Katherine Farmer <kfarme3@uk.ibm.com>
Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Jason Gustafson <jason@confluent.io>
Add some additional size validation to prevent overflows when using `FileRecords`.
Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <ismael@juma.me.uk>
If inter.broker.protocol.version is 2.0-IV1 or newer. Also fixed ListOffsetRequest
so that v2 is used, if applicable.
Added a unit test which verifies that we use the latest version of the various
requests by default. Included a few minor tweaks to make testing easier.
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
We need to use the same lock for metric update and read to avoid NPE and concurrent modification exceptions. Sensor add/remove/update are synchronized on Sensor since they access lists and maps that are not thread-safe. Reporters are notified of metrics add/remove while holding (Sensor, Metrics) locks and reporters may synchronize on the reporter lock. Metric read may be invoked by metrics reporters while holding a reporter lock. So read/update cannot be synchronized using Sensor since that could lead to deadlock. This PR introduces a new lock in Sensor for update/read.
Locking order:
- Sensor#add: Sensor -> Metrics -> MetricsReporter
- Metrics#removeSensor: Sensor -> Metrics -> MetricsReporter
- KafkaMetric#metricValue: MetricsReporter -> Sensor#metricLock
- Sensor#record: Sensor -> Sensor#metricLock
Reviewers: Jun Rao <junrao@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
`mina-core` is a transitive dependency for `apacheds` and `apacheda`. It is safer to use from `apacheds` since that is the implementation.
Reviewers: Ismael Juma <ismael@juma.me.uk>
#5253 broke standby restoration for windowed stores.
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
1. extend isWindowStore to consider session store as well.
2. extend the existing unit test accordingly.
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
See also KIP-319.
Replace number-of-segments parameters with segment-interval-ms parameters in various places. The latter was always the parameter that several components needed, and we accidentally supplied the former because it was the one available.
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
ZooKeeper listener for change notifications should be created before loading the ACL cache to avoid timing window if acls are modified when broker is starting up.
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@confluent.io>
KAFKA-6986:Export Admin Client metrics through Stream Threads
We already exported producer and consumer metrics through KafkaStreams class:
#4998
It makes sense to also export the Admin client metrics.
I didn't add a separate unittest case for this. Let me know if it's needed.
This is my first contribution, feel free to point out any mistakes that I did.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
There are cases where the broker would return an unresolve-able address (e.g broker inside a docker network while client is outside) and the client would not log any information as to why it is timing out, since the default log level does not print DEBUG messages.
Changing this log level will enable easier troubleshooting in such circumstances. This change does not change the logs shown on transient failures like a broker failure.
Use KafkaPrincipal objects for authorization in `SimpleAclAuthorizer` so that comparison with super.users and ACLs instantiated from Strings work. Previously, it would compare two different classes `KafkaPrincipal` and the custom class, which would always return false because of the implementation of `KafkaPrincipal#equals`.
We need to ensure that the last poll time is always updated when the user call poll(Duration). This patch fixes a bug in the new KIP-266 timeout behavior which would cause this to be skipped if the coordinator could not be found while the consumer was in an active group.
Note that I've also fixed some type inconsistencies for various timeouts.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Currently create time is interpreted as integer.
This PR makes the tool accept long values.
Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
1. Rename metrics scope of rocksDB window and session stores; also modify the store metrics accordingly with guidance on its correlations to metricsScope.
2. Add the missing total metrics for per-thread, per-task, per-node and per-store sensors.
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Before KIP-266, consumer.poll(0) would call updateAssignmentMetadataIfNeeded(Long.MAX_VALUE), which makes sure that the rebalance is definitely completed, i.e. both onPartitionRevoked and onPartitionAssigned called within this poll(0). After KIP-266, however, it is possible that only onPartitionRevoked will be called if timeout is elapsed. And hence we need to double check that state is still PARTITIONS_ASSIGNED after the consumer.poll(duration) call.
Reviewers: Ted Yu <yuzhihong@gmail.com>, Matthias J. Sax <matthias@confluent.io>
Previously, the connection-creation metric only accounted for opened connections from the broker. This change extends it to account for received connections.
Do not update LogReadResult after it is initially populated when returning fetches immediately (i.e. without hitting the purgatory). This was done in #3954 as an optimization so that the followers get the potentially updated high watermark. However, since many things can happen (like deleting old segments and advancing log start offset) between initial creation of LogReadResult and the update, we can hit issues like log start offset in fetch response being higher than the last offset in fetched records.
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>