This PR implements KIP-78: Cluster Identifiers [(link)](https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id#KIP-78:ClusterId-Overview) and includes the following changes:
1. Changes to broker code
- generate cluster id and store it in Zookeeper
- update protocol to add cluster id to metadata request and response
- add ClusterResourceListener interface, ClusterResource class and ClusterMetadataListeners utility class
- send ClusterResource events to the metric reporters
2. Changes to client code
- update Cluster and Metadata code to support cluster id
- update clients for sending ClusterResource events to interceptors, (de)serializers and metric reporters
3. Integration tests covering interceptors, (de)serializers and metric reporters on the client side, and protocol changes and metric reporters on the broker side.
4. System tests for upgrading from previous versions.
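For illustration, one way such an identifier could be generated is sketched below in Python. It assumes a random UUID encoded as URL-safe base64 with the padding stripped; the function name `generate_cluster_id` is hypothetical, and the broker itself is not written in Python, so treat this as a sketch of the idea rather than this PR's implementation.

```python
import base64
import uuid


def generate_cluster_id():
    """Return a 22-character, URL-safe identifier derived from a random UUID.

    Assumption: the id is a base64url-encoded UUID without padding.
    """
    raw = uuid.uuid4().bytes  # 16 random bytes
    # base64 of 16 bytes is 24 characters ending in "=="; strip the padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
```

Storing the value (in Zookeeper, per the list above) is a separate step that this sketch does not cover.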
Author: Sumit Arrawatia <sumit.arrawatia@gmail.com>
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #1830 from arrawatia/kip-78
Invoke the statusListener.onFailure() callback on start failures so that the statusBackingStore is updated. This required a fix to putSafe(), which previously prevented any update that was not preceded by a (non-safe) put() from completing, as happens here when a connector or task transitions directly to FAILED.
Worker start methods can still throw if the same connector name or task ID is already registered with the worker, as this condition should not happen.
Author: Shikhar Bhushan <shikhar@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1778 from shikhar/distherder-stayup-take4
Add an optional configuration for the SecureRandom PRNG implementation, with the default behavior being the same (use the default implementation in the JDK/JRE).
Author: Todd Palino <Todd Palino>
Reviewers: Grant Henke <granthenke@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Joel Koshy <jjkoshy@gmail.com>, Jiangjie Qin <becket.qin@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
Closes #1747 from toddpalino/trunk
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #1733 from ewencp/rest-api-retries
ijuma
As discussed in https://github.com/apache/kafka/pull/1645, this patch removes an extraneous line from several __init__.py files, and a few others as well
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #1659 from granders/minor-cleanup-init-files
Fix the test by using a more liberal timeout and forcing more frequent SinkTask.put() calls. Also add some logging to aid future debugging.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #1663 from ewencp/kafka-3935-fix-restart-system-test
Without this file the benchmark does not run nightly.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Geoff Anderson <geoff@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #1645 from enothereska/hotfix-streams-test
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Geoff Anderson, Guozhang Wang, Ismael Juma
Closes #1621 from enothereska/simple-benchmark-streams-system-tests
This fixes test_producer_throughput with compression_type=snappy.
Also: added heap dump on out of memory error to `producer_performance.py` and corrected the upgrade note related to the change in buffer size for compression streams.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes #1385 from ijuma/kafka-3713-test_producer_throughput-snappy-fail and squashes the following commits:
54c7962 [Ismael Juma] Correct upgrade note about buffer size for compression stream
515040b [Ismael Juma] Call `compressor.close()` to fix memory leak
5311e5b [Ismael Juma] Dump heap on out of memory error when running `producer_performance.py`
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Geoff Anderson <geoff@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #1365 from hachikuji/KAFKA-3694
Add a test for changing SASL mechanism using rolling upgrade and a test for rolling upgrade from 0.9.0.x to 0.10.0 with SASL/GSSAPI.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ben Stopford <benstopford@gmail.com>, Geoff Anderson <geoff@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #1290 from rajinisivaram/KAFKA-3634
This patch adds logic for the following:
- remove hard-coded paths to various scripts and jars in kafkatest service classes
- provide a mechanism for overriding path resolution logic with a "pluggable" path resolver class
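A minimal sketch of what a pluggable resolver could look like follows. All class names, method names, and directory layouts here (`FixedPathResolver`, `DevCheckoutPathResolver`, `make_resolver`, `/opt/kafka-<version>`) are hypothetical and chosen for illustration; they are not taken from this patch.

```python
import posixpath  # paths refer to remote Linux nodes, so always use "/" separators


class FixedPathResolver:
    """Hypothetical default: each version lives under <install_root>/kafka-<version>."""

    def __init__(self, install_root="/opt"):
        self.install_root = install_root

    def home(self, version="trunk"):
        return posixpath.join(self.install_root, "kafka-%s" % version)

    def script(self, script_name, version="trunk"):
        return posixpath.join(self.home(version), "bin", script_name)


class DevCheckoutPathResolver(FixedPathResolver):
    """Hypothetical override: trunk is served from a development checkout."""

    def home(self, version="trunk"):
        if version == "trunk":
            return posixpath.join(self.install_root, "kafka-dev")
        return super().home(version)


def make_resolver(resolver_class=FixedPathResolver, **kwargs):
    """Services call this instead of hard-coding paths; tests may swap the class."""
    return resolver_class(**kwargs)
```

With this shape, a service asks the resolver for `script("kafka-topics.sh")` rather than embedding a literal path, and a different layout only requires passing a different `resolver_class`.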
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1245 from granders/configurable-install-path
This actually removes joins altogether, as well as references to self.worker_threads, which is best left as an implementation detail in BackgroundThreadService.
This makes use of hachikuji's recent ducktape patch, and updates the ducktape dependency to 0.5.0.
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1297 from granders/KAFKA-3581-systest-add-join-timeout
A path was wrong in the script and in the documentation.
Author: Roger Hoover <roger.hoover@gmail.com>
Reviewers: Geoff Anderson <geoff@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1309 from theduderog/fix_aws_init
A recent patch added the enable-systest-events flag without any version check, which breaks all uses of the versioned console consumer (e.g. upgrade tests and compatibility tests).
Added a check to only apply the flag if running 0.10.0 or greater.
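The kind of check described above can be sketched as follows. The helper names `parse_version` and `console_consumer_args` are hypothetical, the leading dashes on the flag are an assumption, and the sketch assumes purely numeric dotted version strings (a label such as "trunk" would need separate handling).

```python
def parse_version(version_string):
    """Turn a dotted version such as "0.9.0.1" into a comparable tuple of ints."""
    return tuple(int(part) for part in version_string.split("."))


def console_consumer_args(version_string, base_args):
    """Return the argument list, adding the flag only for 0.10.0 or greater."""
    args = list(base_args)  # copy so the caller's list is not modified
    if parse_version(version_string) >= (0, 10, 0):
        args.append("--enable-systest-events")
    return args
```

Comparing tuples of integers avoids the string-comparison pitfall where "0.9" would sort after "0.10".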
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1298 from granders/minor-systest-fix-versioned-console-consumer
Even if a test calls stop() on console_consumer or verifiable_producer, the producer or consumer may still fail to shut down cleanly and be killed forcefully after a timeout. Some tests need to know whether a clean shutdown happened. This PR adds methods to console_consumer and verifiable_producer to query that.
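One way such a query could work is sketched below, under the assumption that the process prints a JSON event when it finishes shutting down. The class name `CleanShutdownTracker`, its methods, and the event name `shutdown_complete` are hypothetical and are not taken from this PR.

```python
import json


class CleanShutdownTracker:
    """Remembers which nodes reported a completed shutdown in their output."""

    def __init__(self):
        self.clean_shutdown_nodes = set()

    def on_output_line(self, node, line):
        """Feed one line of process output; non-JSON lines are ignored."""
        try:
            event = json.loads(line)
        except ValueError:
            return
        # Assumed event shape: {"name": "shutdown_complete", ...}
        if isinstance(event, dict) and event.get("name") == "shutdown_complete":
            self.clean_shutdown_nodes.add(node)

    def clean_shutdown(self, node):
        """True only if the node reported completion before being killed."""
        return node in self.clean_shutdown_nodes
```

A node that is killed forcefully never emits the event, so `clean_shutdown(node)` stays `False` for it.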
hachikuji and/or granders Please review.
Author: Anna Povzner <anna@confluent.io>
Reviewers: Jason Gustafson, Geoff Anderson, Gwen Shapira
Closes #1278 from apovzner/kafka-3597
granders hachikuji Can you take a look when you have time? Appreciate your time to review.
Author: Liquan Pei <liquanpei@gmail.com>
Reviewers: Grant Henke <granthenke@gmail.com>, Geoff Anderson <geoff@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1263 from Ishiihara/classpath-no-test-jar
Run a sanity test with SASL/PLAIN and a couple of replication tests with SASL/PLAIN and multiple mechanisms.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1282 from rajinisivaram/KAFKA-2693
* Use a fixed `Random` seed in `EndToEndLatency.scala` for determinism
* Add `compression_type` to and remove `consumer_fetch_max_wait` from `end_to_end_latency.py`. The latter was never used.
* Tweak logging of `end_to_end_latency.py` to be similar to `consumer_performance.py`.
* Add `compression_type` to `benchmark_test.py` methods and add `snappy` to `matrix` annotation
* Use randomly generated bytes from a restricted range for `ProducerPerformance` payload. This is a simple fix for now. It can be improved in the PR for KAFKA-3554.
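The two ideas in this list that concern randomness (a fixed seed for determinism, and payload bytes drawn from a restricted range) can be illustrated with a short Python sketch. The function name `make_payload` and the default range of ASCII uppercase letters are assumptions for illustration; the actual changes are in the Scala and Java tools named above.

```python
import random


def make_payload(size, seed=0, low=65, high=90):
    """Build a reproducible payload of `size` bytes with values in [low, high].

    A dedicated Random instance with a fixed seed yields the same bytes on
    every run without touching the global random state.
    """
    rng = random.Random(seed)
    return bytes(rng.randint(low, high) for _ in range(size))
```

Two calls with the same seed return identical payloads, which keeps benchmark inputs comparable across runs.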
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1225 from ijuma/kafka-3558-add-compression_type-benchmark_test.py
Author: Ismael Juma <ismael@juma.me.uk>
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Geoff Anderson <geoff@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1173 from ijuma/kafka-3490-multiple-version-support-perf-tests
https://github.com/apache/kafka/pull/958 fixed the use of prop_file when there are multiple producers (previously, every producer would add to the config). However, it assumes that self.prop_file is initially "". This holds for all existing tests, but it precludes us from extending verifiable producer and adding more properties to the producer config (same as console consumer).
This small PR restores the original behavior and also makes verifiable producer use the prop_file method, for consistency with console consumer.
Also, a few more fixes to verifiable producer came up during the review:
-- fixed each_produced_at_least() method
-- more straightforward use of compression types
granders please review.
Author: Anna Povzner <anna@confluent.io>
Reviewers: Geoff Anderson <geoff@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1192 from apovzner/fix_verifiable_producer
This also fixes KAFKA-3453 and KAFKA-2866.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes #1155 from ijuma/kafka-3475-introduce-our-minikdc
The previous version of ducktape was found to have a memory leak, which caused occasional failures in nightly runs.
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1165 from granders/minor-advance-ducktape-to-0.4.0
Note: This goes only to trunk. The 0.10.0 branch will need a separate PR with different versions.
Author: Gwen Shapira <cshapi@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1109 from gwenshap/minor-fix-version-trunk
The main impediment to bringing up aws machines in parallel using vagrant was the interaction between `vagrant-hostmanager` and `vagrant-aws`. If you disable hostmanager during the `up` phase, and run it after the cluster is up, parallel bringup is possible. The only caveat is that machines must be brought up in small-ish batches to prevent rate limit errors from AWS since `vagrant-aws` doesn't seem to have mechanisms to
This PR:
- disables `vagrant-hostmanager` during bringup
- adds a wrapper script to make it convenient to bring machines up in batches on aws
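The batching approach can be sketched as below. This is not the wrapper script from this PR; the helper names `batches` and `bringup_commands` are hypothetical, and the sketch only builds the command lines (machines are brought up batch by batch, and hostmanager runs once at the end) without executing them.

```python
def batches(machine_names, batch_size):
    """Split the machine list into consecutive groups of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [machine_names[i:i + batch_size]
            for i in range(0, len(machine_names), batch_size)]


def bringup_commands(machine_names, batch_size):
    """Return the command lines to run in order: one `up` per batch, then hostmanager."""
    commands = [["vagrant", "up", "--provider=aws"] + batch
                for batch in batches(machine_names, batch_size)]
    # hostmanager is assumed to be disabled during `up`, so run it once afterwards.
    commands.append(["vagrant", "hostmanager"])
    return commands
```

Each command list could then be passed to `subprocess.check_call`, with a smaller `batch_size` if AWS rate limit errors appear.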
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #982 from granders/vagrant-disable-hostmanager
ewencp gwenshap granders could you have a look please? Thanks.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1096 from enothereska/systest-hotfix-name
becketqin apovzner please have a look. becketqin the test fails when the producer and consumer are 0.9.x and the message format changes on the fly.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ewen Cheslack-Postava, Ismael Juma, Gwen Shapira
Closes #1070 from enothereska/kafka-3202-format-change-fly
apovzner becketqin please have a look if you can. Thanks.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Anna Povzner, Gwen Shapira
Closes #1059 from enothereska/kafka-3188-compatibility