Author: Nafer Sanabria <nafr.snabr@gmail.com>
Reviewers: Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1595 from naferx/minor-typo
Interval lengths for ConsumerPerformance could sometimes be calculated as zero. In such cases, when the bytes read or messages read are also zero, NaN is output for mbRead per second or nMsg per second, whereas zero would be a more appropriate output.
In cases where the interval length is zero but there are bytes and messages to read, an output of Infinity is returned, as expected.
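A minimal sketch of the guard (the `rate` helper and its parameters are illustrative, not the tool's actual code):

```scala
// Report 0.0 instead of NaN for the 0/0 case; a non-zero amount read over
// a zero-length interval still divides to Infinity, as expected.
def rate(amountRead: Double, elapsedMs: Long): Double =
  if (elapsedMs == 0 && amountRead == 0) 0.0
  else amountRead / elapsedMs * 1000.0
```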
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Dong Lin <lindong28@gmail.com>, Jun Rao <junrao@gmail.com>
Closes#788 from vahidhashemian/KAFKA-3111
There seems to be a bug in the JDK: on some versions, the mtime of the file is modified by FileChannel.truncate() even though the javadoc states `If the given size is greater than or equal to the file's current size then the file is not modified.`
This causes problems with log retention, as all the files then look to Kafka like they contain recent data. Therefore, truncate() is now only called if the channel size differs from the target size.
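A minimal sketch of the workaround, assuming a `java.nio.channels.FileChannel` named `channel` and a `targetSize` in bytes:

```scala
// Skip the call entirely when it would be a no-op, so the buggy JDK
// truncate() never gets a chance to touch the file's mtime.
if (channel.size != targetSize)
  channel.truncate(targetSize)
```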
Author: Moritz Siuts <m.siuts@emetriq.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1497 from msiuts/KAFKA-3802-log_mtimes_reset_on_broker_shutdown
This patch fixes two issues:
1. Subsequent regex subscriptions fail with the new consumer.
2. Subsequent regex subscriptions would not immediately refresh metadata to change the subscription of the new consumer and trigger a rebalance.
The final note on the JIRA, stating that a topic created later that matches a consumer's subscription pattern would not be assigned to the consumer upon creation, seems to be by design. A repeat
`subscribe()` to the same pattern, or waiting until the next automatic metadata refresh, would handle that case.
An integration test was also added to verify these issues are fixed with this PR.
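For illustration, a hedged sketch of a repeat pattern subscription with the new consumer (topic patterns and config values are made up):

```scala
import java.util.Properties
import java.util.regex.Pattern
import org.apache.kafka.clients.consumer.{ConsumerRebalanceListener, KafkaConsumer}
import org.apache.kafka.common.TopicPartition

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("group.id", "pattern-group")
props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")

val consumer = new KafkaConsumer[Array[Byte], Array[Byte]](props)
val listener = new ConsumerRebalanceListener {
  override def onPartitionsAssigned(partitions: java.util.Collection[TopicPartition]): Unit = ()
  override def onPartitionsRevoked(partitions: java.util.Collection[TopicPartition]): Unit = ()
}
// A second subscribe to a pattern now refreshes metadata and triggers a
// rebalance instead of failing.
consumer.subscribe(Pattern.compile("test-topic-.*"), listener)
consumer.subscribe(Pattern.compile("other-topic-.*"), listener)
```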
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1572 from vahidhashemian/KAFKA-3854
The reasons to remove it are:
1. It's currently broken. The purpose of the [JIRA](https://issues.apache.org/jira/browse/KAFKA-3761) was to report that the RunningAsController state gets overwritten back to "RunningAsBroker".
2. It's not a useful state.
a. If clients want to use this metric to know whether a broker is ready to receive requests, they do not care whether or not the broker is the controller.
b. There is already a separate boolean property, KafkaController.isActive, which contains this information.
Author: Roger Hoover <roger.hoover@gmail.com>
Reviewers: Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1437 from theduderog/KAFKA-3761-broker-state
The failure could manifest itself if the default metrics registry had some entries from other tests:
`java.lang.AssertionError: Unexpected meter count expected:<0> but was:<3>`
I also removed an unused variable and improved the error message to include the metric name.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes#1544 from ijuma/fix-transient-session-expire-listener-metrics-failure
Co-authored with ijuma.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1536 from vahidhashemian/minor/KAFKA-3176-Followup
With this pull request, the new console consumer can be provided with optional --partition and --offset arguments so that only messages from a particular partition, starting from a particular offset, are consumed.
The following rules are also implemented to avoid invalid combinations of arguments (see the sketch after the list):
- If --partition or --offset is provided --new-consumer has to be provided too.
- If --partition is provided --topic has to be provided too.
- If --offset is provided --partition has to be provided too.
- --offset and --from-beginning cannot be used at the same time.
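A hedged sketch of the checks, as a standalone restatement of the rules above rather than the tool's actual code:

```scala
import joptsimple.{OptionParser, OptionSet}

// Validate the argument combinations listed above; `die` prints usage and aborts.
def validate(parser: OptionParser, options: OptionSet): Unit = {
  def die(msg: String): Nothing = { parser.printHelpOn(System.err); sys.error(msg) }
  if ((options.has("partition") || options.has("offset")) && !options.has("new-consumer"))
    die("--partition and --offset require --new-consumer")
  if (options.has("partition") && !options.has("topic"))
    die("--partition requires --topic")
  if (options.has("offset") && !options.has("partition"))
    die("--offset requires --partition")
  if (options.has("offset") && options.has("from-beginning"))
    die("--offset and --from-beginning cannot be used at the same time")
}
```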
This patch is co-authored with rajinisivaram.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#922 from vahidhashemian/KAFKA-3176
From the kafka-dev mailing list discussion: [[DISCUSS] scalability limits in the coordinator](http://mail-archives.apache.org/mod_mbox/kafka-dev/201605.mbox/%3CCAMQuQBZDdtAdhcgL6h4SmTgO83UQt4s72gc03B3VFghnME3FTAmail.gmail.com%3E)
There's a scalability limit on the new consumer / coordinator regarding the amount of group metadata we can fit into one message. This restricts a combination of consumer group size, topic subscription sizes, topic assignment sizes, and any remaining member metadata.
Under more strenuous use cases like mirroring clusters with thousands of topics, this limitation can be reached even after applying gzip to the __consumer_offsets topic.
Various options were proposed in the discussion:
1. Config change: reduce the number of consumers in the group. This isn't always a realistic answer in more strenuous use cases like MirrorMaker clusters or for auditing.
2. Config change: split the group into smaller groups which together will get full coverage of the topics. This gives each group member a smaller subscription (e.g., g1 has topics starting with a-m while g2 has topics starting with n-z). This would be operationally painful to manage.
3. Config change: split the topics among members of the group. Again this gives each group member a smaller subscription. This would also be operationally painful to manage.
4. Config change: bump up KafkaConfig.messageMaxBytes (a topic-level config) and KafkaConfig.replicaFetchMaxBytes (a broker-level config). Applying messageMaxBytes to just the __consumer_offsets topic seems relatively harmless, but bumping up the broker-level replicaFetchMaxBytes would probably need more attention.
5. Config change: try different compression codecs. Based on 2 minutes of googling, it seems like lz4 and snappy are faster than gzip but have worse compression, so this probably won't help.
6. Implementation change: support sending the regex over the wire instead of the fully expanded topic subscriptions. I think people said in the past that different languages have subtle differences in regex, so this doesn't play nicely with cross-language groups.
7. Implementation change: maybe we can reverse the mapping? Instead of mapping from member to subscriptions, we can map a subscription to a list of members.
8. Implementation change: maybe we can try to break apart the subscription and assignments from the same SyncGroupRequest into multiple records? They can still go to the same message set and get appended together. This way the limit becomes the segment size, which shouldn't be a problem. This can be tricky to get right because we're currently keying these messages on the group, so I think records from the same rebalance might accidentally compact one another, but my understanding of compaction isn't that great.
9. Implementation change: try to apply some tricks on the assignment serialization to make it smaller.
10. Config and Implementation change: bump up the __consumer_offsets topic messageMaxBytes and (from Jun Rao) fix how we deal with the case when a message is larger than the fetch size. Today, if the fetch size is smaller than the message size, the consumer will get stuck. Instead, we can simply return the full message if it's larger than the fetch size without requiring the consumer to manually adjust the fetch size.
11. Config and Implementation change: same as above but only apply the special fetch logic when fetching from internal topics.
This PR provides an implementation of option 11.
That being said, I'm not very happy with this approach as it essentially doesn't honor the "replica.fetch.max.bytes" config. Better alternatives are definitely welcome!
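A hedged sketch of what the option 11 fetch logic could look like (`isInternalTopic` is a hypothetical predicate; the actual patch differs):

```scala
import kafka.log.Log
import kafka.message.MessageSet

// If a fetch from an internal topic such as __consumer_offsets returns
// nothing because the first message exceeds maxBytes, re-read that message
// in full instead of letting the fetcher get stuck.
def readFromLog(log: Log, topic: String, startOffset: Long, maxBytes: Int): MessageSet = {
  val messages = log.read(startOffset, maxBytes).messageSet
  if (messages.sizeInBytes == 0 && isInternalTopic(topic)) // isInternalTopic: hypothetical
    log.read(startOffset, Int.MaxValue).messageSet // no size cap for the oversized message
  else
    messages
}
```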
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jiangjie Qin <becket.qin@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1484 from onurkaraman/KAFKA-3810
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1513 from hachikuji/followup-for-kafka-2720
Adds an error logging message in Log.loadSegments() for the case when an index file corresponding to a log file exists but an exception is thrown while loading it.
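A hedged sketch of the kind of logging involved (exception type and recovery steps as in `Log.loadSegments()`; the exact message may differ):

```scala
try {
  index.sanityCheck()
} catch {
  case e: java.lang.IllegalArgumentException =>
    // Previously the corrupted index was rebuilt silently; now the file is named.
    error(s"Found a corrupted index file ${indexFile.getAbsolutePath}, deleting and rebuilding the index.")
    indexFile.delete()
    segment.recover(config.maxMessageSize)
}
```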
Signed-off-by: Ishita Mandhan <imandha@us.ibm.com>
Author: Ishita Mandhan <imandha@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1480 from imandhan/KAFKA-3762
The `partition` argument is not marked as required and has a default of `0`, according to the tool's help message. However, if `partition` is not provided, the command fails with `Missing required argument "[partition]"`. This patch fixes the tool's required arguments by removing `partition` from them.
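A hedged sketch of the change (option names are illustrative; `CommandLineUtils.checkRequiredArgs` is the helper Kafka tools use for this):

```scala
// Before: partition was (incorrectly) listed among the required arguments.
CommandLineUtils.checkRequiredArgs(parser, options, brokerListOpt, topicOpt, partitionIdOpt)
// After: partition has a default of 0, so it is no longer required.
CommandLineUtils.checkRequiredArgs(parser, options, brokerListOpt, topicOpt)
```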
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1495 from vahidhashemian/minor/simple_consumer_shell_update_required_args
ijuma harshach edoardocomar Can you please review the changes?
edoardocomar I have addressed your comment about the extra space.
Author: Bharat Viswanadham <bharatv@us.ibm.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1474 from bharatviswa504/Kafka-3748
- replace `System.exit(1)` with a regular `return` in order to release the latch blocking the shutdown hook thread from shutting down the JVM
- provide `PrintStream` to the `process` method in order to ease unit testing
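A hedged, much simplified sketch of the reworked method (`receive()` is a hypothetical stand-in for the consumer read):

```scala
import java.io.PrintStream

// `return` lets the shutdown-hook latch count down so the JVM can exit
// (System.exit(1) deadlocked against the hook), and the injected
// PrintStream lets unit tests capture the output.
def process(maxMessages: Int, output: PrintStream): Unit = {
  var messagesRead = 0
  while (maxMessages == -1 || messagesRead < maxMessages) {
    val message = try receive() // hypothetical: blocks for the next message bytes
    catch {
      case e: Exception =>
        output.println(s"Error processing message, terminating consumer process: $e")
        return // was: System.exit(1)
    }
    output.println(new String(message))
    messagesRead += 1
  }
}
```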
Author: Sebastien Launay <sebastien@opendns.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1185 from slaunay/bugfix/KAFKA-3501-console-consumer-hangs
- ZkClient is used for conditional path deletion and wraps `KeeperException.BadVersionException` into `ZkBadVersionException`
- add unit test to `SimpleAclAuthorizerTest` to reproduce the issue and catch potential future regression
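A hedged sketch of the conditional delete (this mirrors what a conditional-delete helper has to catch):

```scala
import org.I0Itec.zkclient.ZkClient
import org.I0Itec.zkclient.exception.ZkBadVersionException

// ZkClient surfaces KeeperException.BadVersionException as
// ZkBadVersionException, so that is the exception a conditional delete sees.
def conditionalDelete(zkClient: ZkClient, path: String, expectedVersion: Int): Boolean =
  try {
    zkClient.delete(path, expectedVersion)
    true
  } catch {
    case _: ZkBadVersionException => false // the node changed since we read it
  }
```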
Author: Sebastien Launay <sebastien@opendns.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1461 from slaunay/bugfix/KAFKA-3783-zk-conditional-delete-path
If socket.receive.buffer.bytes/socket.send.buffer.bytes are set to -1, use the OS defaults.
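A hedged sketch of the guard, using the clients' `Selectable.USE_DEFAULT_BUFFER_SIZE` constant (which is -1):

```scala
import org.apache.kafka.common.network.Selectable

// Only call the explicit setters when a size was configured; otherwise the
// OS default buffer sizes are left in place.
if (sendBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
  socketChannel.socket().setSendBufferSize(sendBufferSize)
if (receiveBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
  socketChannel.socket().setReceiveBufferSize(receiveBufferSize)
```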
Author: Joshi <rekhajoshm@gmail.com>
Author: Rekha Joshi <rekhajoshm@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1469 from rekhajoshm/KAFKA-724-rebased
- Used flatMap instead of map and flatten
- Used isEmpty, nonEmpty and isDefined as appropriate
- Used head, keys and keySet where appropriate
- Used contains, diff and find where appropriate
- Removed redundant val modifier for case class constructor
- Used toString without parentheses consistently, since it takes no parameters and has no side effects
- Removed unnecessary return statements, parentheses and semicolons
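A couple of representative before/after examples of these cleanups (illustrative snippets, not the actual diff):

```scala
val nested = List(List("a", "b"), List("c"))
// map + flatten collapsed into flatMap:
val before = nested.map(_.map(_.toUpperCase)).flatten
val after  = nested.flatMap(_.map(_.toUpperCase))

// explicit size checks replaced by the idiomatic predicates:
val partitions = Seq(0, 1, 2)
val hasAnyBefore = partitions.size > 0
val hasAnyAfter  = partitions.nonEmpty
```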
Author: Joshi <rekhajoshm@gmail.com>
Author: Rekha Joshi <rekhajoshm@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1451 from rekhajoshm/KAFKA-3771
Included a couple of clean-ups: removed an unused variable, and the instantiated `KafkaProducer` is now closed.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Sriharsha Chintalapani <harsha@hortonworks.com>
Closes#1470 from ijuma/move-console-producer-test-to-unit-folder
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#1471 from ijuma/fix-leaking-producers-in-plaintext-producer-send-test
The timestamp of messages consumed by mirror maker is not preserved after sending to the target cluster. The correct behavior is to keep the create timestamp the same in both source and target clusters.
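A hedged sketch of the fix, using the 0.10 `ProducerRecord` constructor that carries an explicit timestamp:

```scala
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.clients.producer.ProducerRecord

// Propagate the source record's create timestamp so the target cluster
// stores the same timestamp as the source.
def toProducerRecord(record: ConsumerRecord[Array[Byte], Array[Byte]]): ProducerRecord[Array[Byte], Array[Byte]] =
  new ProducerRecord(record.topic, null, record.timestamp, record.key, record.value)
```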
Author: Tao Xiao <xiaotao183@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1466 from xiaotao183/KAFKA-3787
The host and port entries under /brokers/ids/<bid> get filled only for the PLAINTEXT security protocol. For other protocols the host is null and the actual endpoint is under "endpoints". This causes an NPE when running the consumer group and offset checker scripts in a Kerberized environment. By always reading the host and port values from the "endpoints" entry, a more meaningful exception is thrown rather than an NPE.
Author: Arun Mahadevan <aiyer@hortonworks.com>
Reviewers: Sriharsha Chintalapani <schintalapani@hortonworks.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1301 from arunmahadevan/cg_kerb_fix
Set OffsetsTopicReplicationFactorProp to 3, like MinInSyncReplicasProp. Otherwise, a consumer was able to consume via assign but not via subscribe, so testProduceAndConsume is now duplicated to check both paths.
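A hedged sketch of the test override (property names per `KafkaConfig`):

```scala
import java.util.Properties
import kafka.server.KafkaConfig

val overridingProps = new Properties()
overridingProps.put(KafkaConfig.MinInSyncReplicasProp, "3")
// Give __consumer_offsets the same replication factor as the cluster's
// min.insync.replicas so the group coordinator (subscribe path) works too.
overridingProps.put(KafkaConfig.OffsetsTopicReplicationFactorProp, "3")
```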
Author: Edoardo Comar <ecomar@uk.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1425 from edoardocomar/KAFKA-3728
I found both issues while investigating the issue described in PR #1425.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Sriharsha Chintalapani <schintalapani@hortonworks.com>, Jun Rao <junrao@gmail.com>
Closes#1455 from ijuma/fix-integration-test-harness-and-zk-test-harness
Since the 'time' argument has a default value of -1, it makes sense to make it an optional argument.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1457 from vahidhashemian/KAFKA-3774
`numAcks` is only used in the `trace` logging statement so it should be a `def` instead of a `val`. Also took the chance to improve the code and documentation a little.
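A hedged sketch of the change (surrounding `Partition` fields are assumed; the trace message is illustrative):

```scala
// Before: a val, computed on every call even when trace logging is off.
// val numAcks = curInSyncReplicas.count(r => r.logEndOffset.messageOffset >= requiredOffset)

// After: a def, so the count only runs if the trace statement is evaluated.
def numAcks = curInSyncReplicas.count(r => r.logEndOffset.messageOffset >= requiredOffset)
trace(s"$numAcks acks satisfied with acks = -1")
```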
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1449 from ijuma/minor-avoid-trace-logging-computation-in-partition
This patch differentiates between when a consumer group is rebalancing and when it is dead, and reports the appropriate error message.
Author: Ishita Mandhan <imandha@us.ibm.com>
Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#1429 from imandhan/KAFKA-3158
Fixed SkimpyOffsetMap.get() when the map is full: the number of attempts is now limited to the number of map slots after the internal positionOf() goes into linear search mode.
Added a unit test.
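A hedged, simplified sketch of the bounded probe (class fields `hash1`, `hash2`, `bytes`, `slots` and helpers `hashInto`/`positionOf` are assumed from the class):

```scala
import java.nio.ByteBuffer

def get(key: ByteBuffer): Long = {
  hashInto(key, hash1)
  var attempt = 0
  var pos = 0
  do {
    // Once every slot has been probed, a full map cannot contain the key;
    // give up instead of probing past the map's bounds.
    if (attempt >= slots)
      return -1L
    pos = positionOf(hash1, attempt)
    bytes.position(pos)
    bytes.get(hash2)
    attempt += 1
  } while (!java.util.Arrays.equals(hash1, hash2))
  bytes.getLong()
}
```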
Co-developed with mimaison
Author: edoardo <ecomar@uk.ibm.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#1352 from edoardocomar/KAFKA-3682
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1440 from vahidhashemian/typo06/fix_typos_in_code_comments
requestObj() returns null for the o.a.k.c.requests objects, so use header() for these.
Once all the requests have been replaced by o.a.k.c.requests objects, we should be able to clean that up, but in the meantime it's useful to trace both.
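A hedged sketch of a trace statement covering both request styles (field names per `RequestChannel.Request`; the message format is illustrative):

```scala
// requestObj is null for requests already migrated to o.a.k.c.requests,
// so fall back to the header in that case.
val requestDesc =
  if (request.requestObj != null) request.requestObj.describe(details = true)
  else request.header.toString
trace(s"Handling request: $requestDesc from connection ${request.connectionId}")
```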
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1435 from mimaison/kafkaapis_trace
To be consistent with `ConfigCommand` and `TopicCommand`.
No release includes this option yet, so we can simply change it.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Mickael Maison, Grant Henke
Closes#1430 from ijuma/use-force-instead-of-yes-in-acl-command and squashes the following commits:
bdf3a57 [Ismael Juma] Update `AclCommandTest`
78b8467 [Ismael Juma] Change variable name to `forceOpt`
0bb27af [Ismael Juma] Use `--force` instead of `--yes` in `AclCommand`
junrao
Currently, the broker state is set to running before it registers itself in ZooKeeper. This is too early in the broker lifecycle. If clients use the broker state as an indicator that the broker is ready to accept requests, they will get errors. This change delays setting the broker state to running until the broker is registered in ZK.
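A hedged sketch of the reordering in `KafkaServer.startup()` (surrounding code elided):

```scala
// KafkaHealthcheck.startup() creates the broker's /brokers/ids/<id> znode;
// only after that do we advertise the RunningAsBroker state.
kafkaHealthcheck.startup()
brokerState.newState(RunningAsBroker) // previously set before ZK registration
```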
Author: Roger Hoover <roger.hoover@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#1426 from theduderog/broker-running-after-zk
Kafka has two configurable compression codecs: the one used by the client (source codec) and the one finally used when storing into the log (target codec). The target codec defaults to KafkaConfig.compressionType and can be dynamically configured through zookeeper.
The GroupCoordinator appends group membership information into the __consumer_offsets topic by:
1. making a message with group membership information
2. making a MessageSet with the single message compressed with the source codec
3. doing a log.append on the MessageSet
Without this patch, KafkaConfig.offsetsTopicCompressionCodec doesn't get propagated to OffsetConfig instantiation, so GroupMetadataManager uses a source codec of NoCompressionCodec when making the MessageSet. Let's say we have enough group information such that the message formed exceeds KafkaConfig.messageMaxBytes before compression but would fall below the threshold after compression using our source codec. Even if we had dynamically configured __consumer_offsets with our favorite compression codec, the log.append will throw RecordTooLargeException during analyzeAndValidateMessageSet since the message was unexpectedly uncompressed instead of having been compressed with the source codec defined by KafkaConfig.offsetsTopicCompressionCodec.
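A hedged sketch of the missing propagation (other `OffsetConfig` fields elided):

```scala
// Pass the broker's configured codec through to OffsetConfig so
// GroupMetadataManager compresses group metadata messages on append.
val offsetConfig = OffsetConfig(
  maxMetadataSize = config.offsetMetadataMaxSize,
  offsetsTopicCompressionCodec = config.offsetsTopicCompressionCodec
  // remaining OffsetConfig fields elided
)
```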
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#1394 from onurkaraman/KAFKA-3718
Added a new argument to AclCommand: --yes. When set, it automatically answers yes to prompts.
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Gwen Shapira
Closes#1406 from mimaison/KAFKA-3732
We want to phase out `ByteBufferMessageSet` eventually, so new code should favour `Record` where possible.
Also use a fixed timestamp in `testCorruptLz4ProduceRequest` to ensure that
the checksum is always the same.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson, Guozhang Wang
Closes#1357 from ijuma/produce-request-test-improvement
Delete per-topic metrics when there are no replicas of any partitions of the topic on a broker.
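A hedged sketch of the cleanup (`BrokerTopicStats.removeMetrics` is the helper for this; the surrounding check is illustrative):

```scala
// Once no partition of the topic has a replica on this broker, drop the
// per-topic metrics so stale meters are no longer reported.
if (!replicaManager.allPartitions.keys.exists { case (t, _) => t == topic })
  BrokerTopicStats.removeMetrics(topic)
```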
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Joel Koshy <jjkoshy.w@gmail.com>, Manikumar reddy O <manikumar.reddy@gmail.com>, Ashish Singh <asingh@cloudera.com, Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes#944 from rajinisivaram/KAFKA-3258
This PR fixes a synchronization issue between the deleteOldSegments() and delete() method calls: log.deleteOldSegments() throws a NullPointerException after log.delete() has been called.
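A hedged sketch of the synchronization (the `Log` class already guards mutations with a private `lock`; body simplified):

```scala
// Take the same lock delete() holds so deleteOldSegments() cannot observe
// a half-torn-down log and hit a NullPointerException.
def deleteOldSegments(predicate: LogSegment => Boolean): Int = {
  lock synchronized {
    val deletable = logSegments.filter(predicate)
    deletable.foreach(deleteSegment)
    deletable.size
  }
}
```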
cc ijuma junrao
Author: Manikumar reddy O <manikumar.reddy@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1367 from omkreddy/KAFKA-3584
The ZkUtils.getSequenceId() method is used to generate broker id sequence numbers. During startup, each broker updates the data at the /brokers/seqid zk path and returns stat.getVersion as the next sequence id.
stat.getVersion returns 1 for the first data update, so ZkUtils.getSequenceId() should return 1 on the first data update.
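A hedged, simplified sketch of the contract (the real method also handles the path not existing yet):

```scala
import org.I0Itec.zkclient.ZkClient

// writeDataReturnStat hands back the Stat of the write; after the first
// data update its version is already 1, which is exactly the first
// sequence id we want to hand out.
def getSequenceId(zkClient: ZkClient, path: String): Int = {
  val stat = zkClient.writeDataReturnStat(path, "", -1)
  stat.getVersion
}
```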
Author: Manikumar reddy O <manikumar.reddy@gmail.com>
Reviewers: Flavio Junqueira <fpj@apache.org>, Ismael Juma <ismael@juma.me.uk>
Closes#1224 from omkreddy/KAFKA-3525
We have had transient failures in this method when Jenkins is
overloaded.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes#1359 from ijuma/increase-producer-close-timeout-in-test
After testing KAFKA-3160 a bit more, I found that the error code was not being set properly in ProduceResponse. This happened because the validation error is raised in the CompressionFactory constructor, which was not wrapped in a try / catch.
ijuma junrao
(This contribution is my original work and I license the work under Apache 2.0.)
Author: Dana Powers <dana.powers@gmail.com>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Gwen Shapira <cshapi@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1344 from dpkp/decompress_error_code
Removed the overly pessimistic require and instead attempt to fill the dedup buffer. Use the (only) map until it is full;
this may allow all dirty segments to be processed (the optimistic case) or may stop in the middle of a dirty segment.
In either case, do the compaction using the map loaded that way.
This patch was developed with edoardocomar
Author: Mickael Maison <mickael.maison@gmail.com>
Reviewers: Jun Rao <junrao@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes#1332 from mimaison/KAFKA-3587
The test sends a large request using multiple writes: the length followed by the request body. The first write should succeed, but since the server closes the connection on processing a length that is too big, subsequent writes may fail. Modified the test to handle this exception.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes#1349 from rajinisivaram/KAFKA-3662
Add references to the new consumer property 'max.partition.fetch.bytes' along with the old consumer property 'fetch.message.max.bytes' in the corresponding warning messages of TopicCommand.
Also, create and leverage a static variable for the default value of the new consumer property.
Also, use 'DEFAULT_...' for default property constant names in the code instead of '..._DEFAULT'.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Manikumar reddy O <manikumar.reddy@gmail.com>, Ashish Singh <asingh@cloudera.com>, Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes#1239 from vahidhashemian/KAFKA-3579
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>
Closes#1338 from ijuma/kafka-3670-controlled-shutdown-leader-selector-preferred-replica