src-kafka

Commit Graph

Author	SHA1	Message	Date
Lucas Wang	685fd03dda	KAFKA-6612: Added logic to prevent increasing partition counts during topic deletion This patch adds logic in handling the PartitionModifications event, so that if the partition count is increased when a topic deletion is still in progress, the controller will restore the data of the path /brokers/topics/"topic" to remove the added partitions. Testing done: Added a new test method to cover the bug Author: Lucas Wang <luwang@linkedin.com> Reviewers: Jiangjie (Becket) Qin <becket.qin@gmail.com> Closes #4666 from gitlw/prevent_increasing_partition_count_during_topic_deletion	7 years ago
Rajini Sivaram	2307314432	MINOR: Fix encoder config to make DynamicBrokerReconfigurationTest stable (#4764 ) DynamicBrokerReconfigurationTest currently assumes that passwords encoded with one secret will fail with an exception if decoded with another secret and configures an old.secret in setUp. This could potentially cause test failures if a password was incorrectly decoded with the wrong secret, since the test writes passwords encoded with the new secret directly to ZooKeeper. Since old.secret is only used in one test for verifying secret rotation, this config can be moved to that test to avoid transient failures. Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
Jason Gustafson	fcf8781602	KAFKA-6683; Ensure producer state not mutated prior to append (#4755 ) We were unintentionally mutating the cached queue of batches prior to appending to the log. This could have several bad consequences if the append ultimately failed or was truncated. In the reporter's case, it caused the snapshot to be invalid after a segment roll. The snapshot contained producer state at offsets higher than the snapshot offset. If we ever had to load from that snapshot, the state was left inconsistent, which led to an error that ultimately crashed the replica fetcher. The fix required some refactoring to avoid sharing the same underlying queue inside ProducerAppendInfo. I have added test cases which reproduce the invalid snapshot state. I have also made an effort to clean up logging since it was not easy to track this problem down. One final note: I have removed the duplicate check inside ProducerStateManager since it was both redundant and incorrect. The redundancy was in the checking of the cached batches: we already check these in Log.analyzeAndValidateProducerState. The incorrectness was the handling of sequence number overflow: we were only handling one very specific case of overflow, but others would have resulted in an invalid assertion. Instead, we now throw OutOfOrderSequenceException. Reviewers: Apurva Mehta <apurva@confluent.io>, Jun Rao <junrao@gmail.com>	7 years ago
Guozhang Wang	f2fbfaaccc	KAFKA-6611: PART I, Use JMXTool in SimpleBenchmark (#4650 ) 1. Use JmxMixin for SimpleBenchmark (will remove the self reporting in #4744), only when loading phase is false (i.e. we are in fact starting the streams app). 2. Reported the full jmx reported metrics in log files, and in the returned data only return the max values: this is because we want to skip the warming up and cooling down periods that will have lower rate numbers, while max represents the actual rate at full speed. 3. Incorporates two other improves to JMXTool: #1241 and #2950 Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Rohan Desai <desai.p.rohan@gmail.com>	7 years ago
Rajini Sivaram	57b1c28d60	MINOR: Fix AdminClient.describeConfigs() of listener configs (#4747 ) Don't return config values from `describeConfigs` if the config type cannot be determined. Obtain config types correctly for listener configs for `describeConfigs` and password encryption. Reviewers: Jason Gustafson <jason@confluent.io>	7 years ago
Matthias J. Sax	f0a29a6935	MINOR: remove obsolete warning in StreamsResetter (#4749 ) Reviewer: Guozhang Wang <guozhang@confluent.io>	7 years ago
Rajini Sivaram	2f90cb86c1	MINOR: Remove acceptor creation in network thread update code (#4742 ) Fix dynamic addition of network threads to only create new Processor threads and not the Acceptor.	7 years ago
Colin Patrick McCabe	f5287ccad2	MINOR: Fix flaky TestUtils functions (#4743 ) TestUtils#produceMessages should always close the KafkaProducer, even when there is an exception. Otherwise, the test will leak threads when there is an error. TestUtils#createNewProducer should create a producer with a requestTimeoutMs of 30 seconds by default, not around 10 seconds. This should avoid tests that flake when the load on Jenkins climbs. Fix two cases where a very short timeout of 2 seconds was getting set. Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
Manikumar Reddy O	aefe35e493	KAFKA-6680: Fix issues related to Dynamic Broker configs (#4731 ) - Fix kafkaConfig initialization if there are no dynamic configs defined in ZK. - Update DynamicListenerConfig.validateReconfiguration() to check new Listeners must be subset of listener map Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	7 years ago
Ismael Juma	6fab286da2	MINOR: Fix some compiler warnings (#4726 )	7 years ago
Jason Gustafson	7041e76bd6	MINOR: Some logging improvements for debugging delayed produce status (#4691 ) A few small logging improvements which help debugging replication issues.	7 years ago
Dong Lin	4391a4214d	MINOR: Use log start offset as high watermark if current value is out of range (#4722 ) Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>	7 years ago
Dhruvil Shah	ae31ee63dc	KAFKA-6530: Use actual first offset of message set when rolling log segment (#4660 ) Use the exact first offset of message set when rolling log segment. This is possible to do for message format V2 and beyond without any performance penalty, because we have the first offset stored in the header. This augments the fix made in KAFKA-4451 to avoid using the heuristic for V2 and beyond messages. Added unit tests to simulate cases where segment needs to roll because of overflow in index offsets. Verified that the new segment created in these cases uses the first offset, instead of the heuristic in use previously.	7 years ago
Jason Gustafson	9782465d6f	KAFKA-6672; ConfigCommand should create config change parent path if needed (#4727 ) Change `KafkaZkClient.createConfigChangeNotification` to ensure creation of the change directory. This fixes failing system tests which depend on setting SCRAM credentials prior to broker startup. Existing test case has been modified for new expected usage. Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
Jason Gustafson	372b4c6a77	KAFKA-6656; Config tool should return non-zero status code on failure (#4711 ) Prior to this patch, we caught some exceptions when executing the command, which meant that it would return with status code zero. This patch fixes this and makes the expected exit behavior explicit. Test cases have been added to verify the change. Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
Rajini Sivaram	5cdb951091	KAFKA-6653; Complete delayed operations even when there is lock contention (#4704 ) If there is lock contention while multiple threads check if a delayed operation may be completed (e.g. a produce request with acks=-1), the threads perform completion only if the lock is free, to avoid deadlocks. This leaves a timing window when an operation becomes ready to complete after another thread has acquired the lock and performed the check for completion, but not yet released the lock. The PR adds an additional flag to ensure that the operation is completed in this case.	7 years ago
Dong Lin	6b08905dfb	KAFKA-3978; Ensure high watermark is always positive (#4695 ) Partition high watermark may become -1 if the initial value is out of range. This situation can occur during partition reassignment, for example. The bug was fixed and validated with unit test in this patch. Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>	7 years ago
Manikumar Reddy O	ad355298c6	MINOR: Remove unused server exceptions (#4701 )	7 years ago
Rajini Sivaram	2bf06890b9	MINOR: Use large batches in metrics test for conversion time >= 1ms (#4681 )	7 years ago
Dong Lin	1ea07b993d	KAFKA-6624; Prevent concurrent log flush and log deletion (#4663 ) KAFKA-6624; Prevent concurrent log flush and log deletion Reviewers: Ted Yu <yuzhihong@gmail.com>, Jun Rao <junrao@gmail.com>	7 years ago
Ismael Juma	825bfe5ade	MINOR: Revert to ZooKeeper 3.4.10 due to ZOOKEEPER-2960 (#4678 ) It's a critical bug that only affects the server, but we don't have an easy way to use 3.4.11 for the client only. Reviewers: Jun Rao <junrao@gmail.com>, Damian Guy <damian.guy@gmail.com>	7 years ago
Rajini Sivaram	8df96a4119	MINOR: Reduce ZK reads and ensure ZK watch is set for listener update (#4670 ) Ensures that ZK watch is set for each live broker for listener update notifications in the controller. Also avoids reading all brokers from ZooKeeper when a broker metadata is modified by passing in brokerId to BrokerModifications and reading only the updated broker. The existing listener update test verifies both these changes. Earlier, the test did not detect missing watch for the last broker since metadata of all brokers were read from ZK (adding a watch for all) when any broker was updated. Reviewers: Jun Rao <junrao@gmail.com>	7 years ago
Rajini Sivaram	8d4d5f8c9f	MINOR: Fix deadlock in ZooKeeperClient.close() on session expiry (#4672 ) Reviewers: Jun Rao <junrao@gmail.com>	7 years ago
Radai Rosenblatt	5760da7d0b	KAFKA-6622; Fix performance issue loading consumer offsets (#4661 ) `batch.baseOffset` is an expensive operation (even says so in its javadoc), and yet was called for every single record in a batch when loading offsets. This means that for N records in a gzipped batch, the entire batch will be unzipped N times. The fix is to compute and cache the base offset once as we decompress and process the batch. Reviewers: Dong Lin <lindong28@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>	7 years ago
Rajini Sivaram	3ef2fb843e	MINOR: Fix record conversion time in metrics (#4671 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
Jimin Hsieh	db619c6da2	MINOR: Remove unused local variable in SocketServer (#4669 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
Ewen Cheslack-Postava	d13cbd0cae	KAFKA-3806: Increase offsets retention default to 7 days (KIP-186) (#4648 ) Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>	7 years ago
Sandor Murakozi	9868976747	KAFKA-6111: Improve test coverage of KafkaZkClient, fix bugs found by new tests;	7 years ago
Viktor Somogyi	9166ae4ec4	MINOR: Remove unnecessary semicolon in ResourceType (#4626 )	7 years ago
huxi	2591a46930	KAFKA-5327; Console Consumer should not commit messages not printed (#4546 ) Ensure that the consumer's offsets are reset prior to closing so that any buffered messages which haven't been printed are not committed.	7 years ago
Gilles Degols	3600316a14	KAFKA-6057: Users forget `--execute` in the offset reset tool (#4069 ) Add a small warning note when the user does not use the --execute flag.	7 years ago
Matthias J. Sax	5df535e8a3	MINOR: fixes lgtm.com warnings (#4582 ) fixes lgmt.com warnings cleanup PrintForeachAction and Printed Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Sebastian Bauersfeld <sebastianbauersfeld@gmx.de>, Damian Guy <damian@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Jason Gustafson	e26d0d7604	MINOR: Revert incompatible behavior change to consumer reset tool (#4611 ) This patch reverts the removal of the --execute option in the offset reset tool and the change to the default behavior when no options were present. For consistency, this patch adds the --execute flag to the streams reset tool, but keeps its current default behavior. A note has been added to both of these commands to warn the user that future default behavior will be to prompt before acting. Test cases were not actually validating that offsets were committed when the --execute option was present, so I have fixed that and added basic assertions for the dry-run behavior. I also removed some duplicated test boilerplate. Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Vahid Hashemian	1cabef0d3d	MINOR: Refactor GroupMetadataManager cleanupGroupMetadata (#4504 ) Refactoring avoids the need to call this method with a infinity as current time to remove all group offsets (when manually deleting the group).	7 years ago
Rajini Sivaram	90e0bbec94	KAFKA-6573: Update brokerInfo in KafkaController on listener update (#4603 ) Update `KafkaController.brokerInfo` when listeners are updated since this value is used to register the broker in ZooKeeper if there ZK session expires. Also added test to verify values in ZK after session expiry.	7 years ago
Jason Gustafson	660c0c0aa3	KAFKA-6238; Fix inter-broker protocol message format compatibility check This patch fixes a bug in the validation of the inter-broker protocol and the message format version. We should allow the configured message format api version to be greater than the inter-broker protocol api version as long as the actual message format versions are equal. For example, if the message format version is set to 1.0, it is fine for the inter-broker protocol version to be 0.11.0 because they both use message format v2. I have added a unit test which checks compatibility for all combinations of the message format version and the inter-broker protocol version. Author: Jason Gustafson <jason@confluent.io> Reviewers: Ismael Juma <ismael@juma.me.uk> Closes #4583 from hachikuji/KAFKA-6328-REOPENED	7 years ago
Ismael Juma	eaafbdecb5	MINOR: Fix logger name override (#4600 ) This regressed during the log4j -> scalalogging change. Added unit tests, one of which failed before the fix.	7 years ago
Jiangjie (Becket) Qin	a413733094	KAFKA-6568: Minor clean-ups follow-up (#4592 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
Jason Gustafson	1547cf6de8	KAFKA-6554; Missing lastOffsetDelta validation before log append (#4585 ) Add validation checks that the offset range is valid and aligned with the batch count prior to appending to the log. Several unit tests have been added to verify the various invalid cases.	7 years ago
Jiangjie (Becket) Qin	7c09b9410a	KAFKA-6568; Log cleaner should check partition state before removal from inProgress map (#4580 ) The log cleaner should not naively remove the partition from in progress map without checking the partition state. This may cause the other thread calling `LogCleanerManager.abortAndPauseCleaning()` to hang indefinitely.	7 years ago
Rajini Sivaram	b20639db44	MINOR: Enable deep-iteration to print data in DumpLogSegments (#4396 ) Enable deep-iteration option when print-data-log is enabled in DumpLogSegments. Otherwise data is not printed. Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>	7 years ago
Rajini Sivaram	c72062ddc0	KAFKA-6517: Avoid deadlock in ZooKeeperClient during session expiry (#4551 ) ZooKeeperClient acquires initializationLock#writeLock to establish a new connection while processing session expiry WatchEvent. ZooKeeperClient#handleRequests acquires initializationLock#readLock, allowing multiple batches of requests to be processed concurrently, but preventing reconnections while processing requests. At the moment, handleRequests holds onto the readLock throughout the method, even while waiting for responses and inflight requests to complete. But responses cannot be delivered if event thread is blocked on the writeLock to process session expiry event. This results in a deadlock. During broker shutdown, the shutdown thread is also blocked since it needs the readLock to perform ZooKeeperClient#unregisterStateChangeHandler, which cannot be acquired if a session expiry had occurred earlier since this thread gets queued behind the event handler thread waiting for writeLock. This commit reduces locking in ZooKeeperClient#handleRequests to just the non-blocking send, so that session expiry handling doesn't get blocked when a send is blocked waiting for responses. Also moves session expiry handling to a separate thread so that Kafka controller doesn't block the event handler thread when processing session expiry.	7 years ago
Manikumar Reddy O	d9d0d79287	MINOR: Update test classes to use KafkaZkClient methods (#4367 ) Remove ZkUtils reference form ZooKeeperTestHarness plus some minor cleanups.	7 years ago
Lucas Wang	88307657df	KAFKA-6481; Improving performance of the function ControllerChannelManager.addUpd… …ateMetadataRequestForBrokers More detailed description of your change, if necessary. The PR title and PR message become the squashed commit message, so use a separate comment to ping reviewers. Summary of testing strategy (including rationale) for the feature or bug fix. Unit and/or integration tests are expected for any behaviour change and system tests should be considered for larger changes. Author: Lucas Wang <luwang@linkedin.com> Reviewers: Jun Rao <junrao@gmail.com> Closes #4472 from gitlw/improving_addUpdateMetadataRequestForBrokers	7 years ago
Rajini Sivaram	38e9958d6e	KAFKA-6476: Documentation for dynamic broker configuration (#4558 ) Docs for dynamic broker configuration (KIP-226)	7 years ago
Jason Gustafson	6d18d882b8	KAFKA-6397: Consumer should not block setting positions of unavailable partitions (#4557 ) Prior to this patch, the consumer always blocks in poll() if there are any partitions which are awaiting their initial positions. This behavior was inconsistent with normal fetch behavior since we allow fetching on available partitions even if one or more of the assigned partitions becomes unavailable _after_ initial offset lookup. With this patch, the consumer will do offset resets asynchronously, which allows other partitions to make progress even if the initial positions for some partitions cannot be found. I have added several new unit tests in `FetcherTest` and `KafkaConsumerTest` to verify the new behavior. One minor compatibility implication worth mentioning is apparent from the change I made in `DynamicBrokerReconfigurationTest`. Previously it was possible to assume that all partitions had a fetch position after `poll()` completed with a non-empty assignment. This assumption is no longer generally true, but you can force the positions to be updated using the `position()` API which still blocks indefinitely until a position is available. Note that this this patch also removes the logic to cache committed offsets in `SubscriptionState` since it was no longer needed (the consumer's `committed()` API always does an offset lookup anyway). In addition to avoiding the complexity of maintaining the cache, this avoids wasteful offset lookups to refresh the cache when `commitAsync()` is used. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	7 years ago
Guozhang Wang	f3a3253e24	HOTFIX: Fix reset integration test hangs on busy wait (#4491 ) * do not use static properties * use new object to take appID * capture timeout exception inside condition Reviewers: Matthias J. Sax <matthias@confluent.io>	7 years ago
lisa2lisa	c0d579ca12	MINOR: fix inconsistance in LogCleaner javadoc (#4027 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
xin	98959a1266	KAFKA-6552: Use “entity_type” in description of kafka-configs.sh (#4556 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
parafiend	fc56a90e05	KAFKA-6529: Stop file descriptor leak when client disconnects with staged receives (#4517 ) If an exception is encountered while sending data to a client connection, that connection is disconnected. If there are staged receives for that connection, they are tracked to process those records. However, if the exception was encountered during processing a `RequestChannel.Request`, the `KafkaChannel` for that connection is muted and won't be processed. Disable processing of outstanding staged receives if a send fails. This stops the leak of the memory for pending requests and the file descriptor of the TCP socket. Test that a channel is closed when an exception is raised while writing to a socket that has been closed by the client. Since sending a response requires acks != 0, allow specifying the required acks for test requests in SocketServerTest.scala. Author: Graham Campbell <graham.campbell@salesforce.com> Reviewers: Jason Gustafson <jason@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>, Ted Yu <yuzhihong@gmail.com>	7 years ago

... 3 4 5 6 7 ...

2341 Commits (9dac615d228c5b3464c6322aea9f9ce70f9ef37b)