It's a critical bug that only affects the server, but we
don't have an easy way to use 3.4.11 for the client
only.
Reviewers: Jun Rao <junrao@gmail.com>, Damian Guy <damian.guy@gmail.com>
Ensures that a ZK watch is set for each live broker for listener update notifications in the controller. Also avoids reading all brokers from ZooKeeper when a broker's metadata is modified, by passing the brokerId to BrokerModifications and reading only the updated broker.
The existing listener update test verifies both of these changes. Previously, the test did not detect a missing watch for the last broker, since the metadata of all brokers was read from ZK (adding a watch for each) whenever any broker was updated.
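A minimal sketch of the idea (the `Zk` interface is illustrative, not the controller's actual types): one modification watch is registered per live broker, and the change handler reads back only that broker.

```java
import java.util.Set;

// Illustrative sketch only: Zk stands in for the controller's ZooKeeper client.
class BrokerWatchRegistrar {
    interface Zk {
        void watchBrokerModifications(int brokerId, Runnable onChange);
        void readBrokerMetadata(int brokerId);
    }

    static void registerWatches(Zk zk, Set<Integer> liveBrokerIds) {
        for (int brokerId : liveBrokerIds) {
            // One watch per live broker; on change, re-read only this broker
            // instead of fetching all brokers from ZooKeeper.
            zk.watchBrokerModifications(brokerId, () -> zk.readBrokerMetadata(brokerId));
        }
    }
}
```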
Reviewers: Jun Rao <junrao@gmail.com>
This will allow us to trace leaked instances back to the job,
so that we can figure out what happened and fix the leak.
Reviewers: Ismael Juma <ismael@juma.me.uk>
`batch.baseOffset` is an expensive operation (its javadoc even says so), yet it was called for every single record in a batch when loading offsets. This means that for N records in a gzipped batch, the entire batch would be unzipped N times. The fix is to compute and cache the base offset once as we decompress and process the batch.
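A minimal sketch of the fix (names are illustrative, not Kafka's internal types): the expensive per-batch value is computed once and reused for every record.

```java
import java.util.List;

// Illustrative sketch: RecordBatchLike stands in for Kafka's RecordBatch.
class OffsetLoader {
    interface RecordBatchLike {
        long baseOffset();    // expensive: may decompress the whole batch
        int[] offsetDeltas(); // cheap per-record deltas
    }

    static void load(List<RecordBatchLike> batches) {
        for (RecordBatchLike batch : batches) {
            long baseOffset = batch.baseOffset(); // computed and cached once per batch
            for (int delta : batch.offsetDeltas()) {
                long absoluteOffset = baseOffset + delta; // O(1) per record
                process(absoluteOffset);
            }
        }
    }

    static void process(long offset) { /* ... */ }
}
```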
Reviewers: Dong Lin <lindong28@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
It generates the producer payload (key and value) and makes sure that the values are
populated to target a realistic compression rate (0.3 - 0.4) if compression is used.
The generated payload is deterministic and can be replayed from a given position.
For now, all generated values are constant size, and key types can be configured
to be either null or 8 bytes.
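A hedged sketch of one way to achieve this (illustrative, not the actual benchmark code): fill a fraction of each value with random bytes and leave the rest compressible, and seed the RNG with the position so the payload can be replayed deterministically.

```java
import java.util.Random;

// Illustrative sketch only; the random fraction would be tuned to hit
// the target ~0.3-0.4 compression ratio.
class PayloadGenerator {
    private final int valueSize;
    private final long seed;

    PayloadGenerator(int valueSize, long seed) {
        this.valueSize = valueSize;
        this.seed = seed;
    }

    byte[] valueAt(long position) {
        byte[] value = new byte[valueSize];
        Random random = new Random(seed + position); // deterministic per position
        byte[] randomPart = new byte[valueSize / 2]; // incompressible portion
        random.nextBytes(randomPart);
        System.arraycopy(randomPart, 0, value, 0, randomPart.length);
        return value; // the zeroed remainder compresses well
    }
}
```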
Added a messageSize parameter to the producer spec, which specifies the produced key + message size.
Now that we have augmented the windowed serdes with no-arg constructors, extract them out as part of the public API so that users who want to read and write windowed streams can use them. This was originally introduced by @vitaly-pushkar
This PR grew into a much larger one, as I found some tech debt and a few bugs while working on it. Here is a summary of the PR:
Public API changes (I will propose a KIP after a first round of reviews):
Add TimeWindowedSerializer, TimeWindowedDeserializer, SessionWindowedSerializer, SessionWindowedDeserializer into o.a.k.streams.kstream (see the usage sketch after this summary). The serializers implement an internal WindowedSerializer interface providing the serializeBaseKey function used in the WindowedStreamPartitioner fix below.
Add WindowedSerdes into o.a.k.streams.kstream. The reason not to add them into o.a.k.clients' Serdes is that it would then need a dependency on streams.
Add "default.windowed.key.serde.inner" and "default.windowed.value.serde.inner" into StreamsConfig, used when "default.key.serde" is specified to use time or session windowed serde. Note this requires the serde class, not the type class.
Consolidated serde format from multiple classes, including SessionKeySerde.java for session, and WindowStoreUtils for time window, into SessionKeySchema and WindowKeySchema.
Bug fix: WindowedStreamPartitioner needs to consider both time window and session window serdes.
Removed the RocksDBWindowBytesStore etc. optimization: after KIP-182 all serde calls now happen in the metered store, so this optimization is no longer worthwhile.
Bug fix: for time windows, the serdes used for the store and the serdes used for piping (source and sink nodes) are different: the former needs to append a sequence number, but the latter does not.
Other minor cleanups: remove unnecessary throws, etc.
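A hedged usage sketch of the new public classes (assuming the constructors take an inner serializer/deserializer, per the description above):

```java
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.TimeWindowedDeserializer;
import org.apache.kafka.streams.kstream.TimeWindowedSerializer;
import org.apache.kafka.streams.kstream.Windowed;

class WindowedSerdeExample {
    // Build a Serde<Windowed<String>> from an inner String serde.
    static Serde<Windowed<String>> timeWindowedStringSerde() {
        Serde<String> inner = Serdes.String();
        return Serdes.serdeFrom(
                new TimeWindowedSerializer<>(inner.serializer()),
                new TimeWindowedDeserializer<>(inner.deserializer()));
    }
}
```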
Authors: Guozhang Wang <wangguoz@gmail.com>, Vitaly Pushkar <vitaly.pushkar@gmail.com>
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>, Xi Hu
Author: John Roesler <john@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>
If a fetch from a Window Store returns a null byte array, we should return null rather than passing it to the serde to deserialize.
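A minimal sketch of the guard (names are illustrative):

```java
import org.apache.kafka.common.serialization.Deserializer;

class FetchGuard {
    // Short-circuit on a null fetch result instead of handing null to the serde.
    static <V> V deserializeOrNull(byte[] rawValue, Deserializer<V> deserializer, String topic) {
        if (rawValue == null)
            return null; // nothing stored for this key/window
        return deserializer.deserialize(topic, rawValue);
    }
}
```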
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Logging can get spammy during the reconnect blackout period because any requests we send to ConsumerNetworkClient will immediately be failed when poll() returns. This patch checks for connection failures prior to sending fetches and offset lookups and skips sending to any failed nodes. Test cases added for both.
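A hedged sketch of the approach (the predicate stands in for the client's connection-failure check; names are illustrative):

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

class FetchSender<N, R> {
    private final Predicate<N> connectionFailed; // e.g. backed by the client's connection state

    FetchSender(Predicate<N> connectionFailed) {
        this.connectionFailed = connectionFailed;
    }

    void sendFetches(List<N> nodes, Function<N, R> requestFor) {
        for (N node : nodes) {
            if (connectionFailed.test(node))
                continue; // node is in its reconnect blackout; don't send (and don't spam logs)
            R request = requestFor.apply(node);
            // ... enqueue the request for this node ...
        }
    }
}
```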
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
Sorting TaskIds on first assignment, evenly distributing tasks by topicGroupId, should help even out the load of work across topologies (see the sketch below). This PR is an initial "strawman" approach which will be followed up (at a later date, yet to be determined) by scoring or assigning weights to processing nodes to ensure an even processing distribution.
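A minimal sketch of the sorting step (TaskId here is a simplified stand-in for Streams' TaskId): sorting by (topicGroupId, partition) means a subsequent round-robin assignment hands each client an even share of every topology's tasks.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

class TaskSorting {
    static class TaskId {
        final int topicGroupId;
        final int partition;
        TaskId(int topicGroupId, int partition) {
            this.topicGroupId = topicGroupId;
            this.partition = partition;
        }
    }

    static List<TaskId> sortedForAssignment(Collection<TaskId> tasks) {
        List<TaskId> sorted = new ArrayList<>(tasks);
        // Group by topology (topicGroupId) so a round-robin assignment
        // spreads each topology's tasks evenly across the clients.
        sorted.sort(Comparator.<TaskId>comparingInt(t -> t.topicGroupId)
                              .thenComparingInt(t -> t.partition));
        return sorted;
    }
}
```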
Added a new test to the existing unit tests.
NetworkClient should use FIFO order when completing inflight requests following a disconnect.
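A minimal sketch of the idea (the structure is illustrative): per-node in-flight requests are stacked newest-first, so FIFO completion after a disconnect means draining from the oldest end.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class InFlight<R> {
    private final Deque<R> requests = new ArrayDeque<>();

    void add(R request) {
        requests.addFirst(request); // newest request at the head
    }

    // On disconnect, complete in send (FIFO) order: oldest first.
    List<R> clearAll() {
        List<R> inSendOrder = new ArrayList<>();
        while (!requests.isEmpty())
            inSendOrder.add(requests.pollLast());
        return inSendOrder;
    }
}
```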
I've added new unit tests for `InFlightRequests` and `NetworkClient` which verify completion order.
Reviewers: Jun Rao <junrao@gmail.com>
Author: Bill Bejeck <bill@confluent.io>
Reviewers: Damian Guy <damian@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
We need to reset the auto-commit deadline after sending the offset commit request so that we do not resend it while the request is still inflight.
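A minimal sketch of the timing change (names are illustrative): the deadline is reset at send time, not at completion time.

```java
class AutoCommitTimer {
    private final long intervalMs;
    private long nextDeadlineMs;

    AutoCommitTimer(long intervalMs, long nowMs) {
        this.intervalMs = intervalMs;
        this.nextDeadlineMs = nowMs + intervalMs;
    }

    void maybeAutoCommit(long nowMs) {
        if (nowMs >= nextDeadlineMs) {
            nextDeadlineMs = nowMs + intervalMs; // reset *before* the response arrives
            sendOffsetCommit(); // async; won't be resent while still in flight
        }
    }

    private void sendOffsetCommit() { /* ... send the async commit request ... */ }
}
```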
Added unit tests ensuring this behavior and proper backoff in the case of a failure.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Contention for the lock in ConsumerNetworkClient can lead to a livelock situation in which an active commitSync is unable to make progress because its completion is blocked in the heartbeat thread. The fix is twofold:
1) We change ConsumerNetworkClient to use a fair lock to reduce the chance of each thread getting starved.
2) We eliminate the dependence on the lock in ConsumerNetworkClient for callback completion so that callbacks will not be blocked by an active poll(). Both parts are sketched below.
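A sketch of both parts, with an illustrative structure (the real class manages network I/O under the lock):

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

class FairPollingClient {
    // (1) A fair lock hands the lock off in FIFO order, so neither the
    // heartbeat thread nor the caller of commitSync() is starved.
    private final ReentrantLock lock = new ReentrantLock(true);

    // (2) Completions are queued lock-free, so a callback never has to
    // wait for an in-progress poll() to release the lock.
    private final ConcurrentLinkedQueue<Runnable> pendingCompletion = new ConcurrentLinkedQueue<>();

    void complete(Runnable callback) {
        pendingCompletion.add(callback); // no lock required
    }

    void poll() {
        lock.lock();
        try {
            // ... network I/O ...
        } finally {
            lock.unlock();
        }
        firePendingCompletions(); // run callbacks outside the lock
    }

    private void firePendingCompletions() {
        Runnable callback;
        while ((callback = pendingCompletion.poll()) != null)
            callback.run();
    }
}
```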
Reviewers: Guozhang Wang <wangguoz@gmail.com>
* Add a new fetch(K key, long windowStartTimestamp) API to ReadOnlyWindowStore (see the sketch after this list).
* Use the new API to replace the range fetch API in KStreamWindowedAggregate and KStreamWindowedReduce.
* Added corresponding unit tests.
* Also removed some redundant byte serdes in byte stores.
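A hedged sketch of how the point lookup replaces the single-timestamp range fetch (the interface shape follows the description above; it is not the full ReadOnlyWindowStore API):

```java
// Illustrative subset of the store interface.
interface WindowStoreLike<K, V> {
    V fetch(K key, long windowStartTimestamp); // new point lookup
}

class WindowedAggregateStep<K, V> {
    // Before: a range fetch over [windowStart, windowStart] returning an iterator.
    // After: a direct lookup of the single window, avoiding iterator overhead.
    V currentAggregate(WindowStoreLike<K, V> store, K key, long windowStart) {
        return store.fetch(key, windowStart);
    }
}
```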
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Damian Guy <damian.guy@gmail.com>
Closes #3917 from ewencp/stage-docs
The fix is twofold:
For tasks closed in closeZombieTask, their corresponding partitions are still in runningByPartition, so those closed tasks may still be returned in activeTasks and standbyTasks. Add guards on the returned tasks and, if they are closed, notify the thread to trigger a rebalance immediately.
When triggering a rebalance, unsubscribe and re-subscribe immediately to make sure we are not dependent on the background heartbeat thread's timing (see the sketch below).
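A minimal sketch of forcing the immediate rebalance with the consumer API (the surrounding Streams wiring is omitted):

```java
import java.util.Collection;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;

class RebalanceEnforcer {
    // Drop and re-establish the subscription so a rebalance is triggered
    // on the next poll(), independent of the heartbeat thread's timing.
    static void enforceRebalance(Consumer<byte[], byte[]> consumer,
                                 Collection<String> topics,
                                 ConsumerRebalanceListener listener) {
        consumer.unsubscribe();
        consumer.subscribe(topics, listener);
    }
}
```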
Some minor logging changes: I moved the closeZombieTask log entry to its callers, with more context information and the action about to be taken.
I was able to reproduce the issue with EosIntegrationTest by hand-coding the heartbeat thread to GC, and confirmed this patch fixes the issue. Unfortunately this test cannot be added to AK since we currently have no way to manipulate the heartbeat thread in unit tests.
Reviewers: Jason Gustafson <jason@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This is a straightforward change that aligns the name of the partition assignor with Streams.
Reviewers: Matthias J. Sax <mjsax@apache.org>
fixes lgtm.com warnings
cleanup PrintForeachAction and Printed
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Sebastian Bauersfeld <sebastianbauersfeld@gmx.de>, Damian Guy <damian@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
ensure that brokers are registered at ZK before start() returns
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Damian Guy <damian@confluent.io>, Guozhang Wang <guozhang@confluent.io>
This patch reverts the removal of the --execute option in the offset reset tool and the change to the default behavior when no options were present. For consistency, this patch adds the --execute flag to the streams reset tool, but keeps its current default behavior. A note has been added to both of these commands to warn the user that future default behavior will be to prompt before acting.
Test cases were not actually validating that offsets were committed when the --execute option was present, so I have fixed that and added basic assertions for the dry-run behavior. I also removed some duplicated test boilerplate.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
Any exception thrown by calls within a `main()` method is not logged unless this is done explicitly. This change simply adds a try-catch block around most of the content of the distributed and standalone `main()` methods, as sketched below.
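A sketch of the pattern (the worker startup details are omitted; the class and logger names are illustrative):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class WorkerMain {
    private static final Logger log = LoggerFactory.getLogger(WorkerMain.class);

    public static void main(String[] args) {
        try {
            // ... parse the config and start the Connect worker ...
        } catch (Throwable t) {
            // Without this, a startup exception kills the JVM without
            // leaving any trace in the log file.
            log.error("Stopping due to error", t);
            System.exit(2);
        }
    }
}
```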
**NOTE: This should be backported to the `1.1` branch, and is currently a blocker for 1.1.**
The `connect_test.py::ConnectStandaloneFileTest.test_file_source_and_sink` system test is failing with the SASL configuration without a sufficient explanation. During the test, the Connect worker fails to start, but the Connect log contains no useful information. There are actually several things compounding to cause the failure and make it difficult to understand the problem.
First, the `tests/kafkatest/tests/connect/templates/connect_standalone.properties` file only adds the broker's security configuration with the `producer.` and `consumer.` prefixes, but does not add it with no prefix. The worker uses the AdminClient to connect to the broker to get the Kafka cluster ID and to manage the three internal topics, and the AdminClient is configured via top-level properties. Because the SASL test requires that all clients connect using SASL, the lack of top-level broker security configs means the AdminClient was attempting and failing to connect to the broker. This is corrected by adding the broker's security configuration to the Connect worker configuration file at the top level. (This was already being done in the `connect_distributed.properties` file.)
Second, the default `request.timeout.ms` for the AdminClient (and the other clients) is 120 seconds, so the AdminClient would retry for 120 seconds before giving up and throwing an error. However, the test was only waiting 60 seconds before determining that the service failed to start. This can be corrected by setting `request.timeout.ms=10000` in the Connect distributed and standalone worker configurations, as sketched below.
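A sketch of the two configuration fixes in one place (SASL_PLAINTEXT is used here only as an example protocol; the helper is illustrative):

```java
import java.util.Properties;

class ConnectWorkerTestConfig {
    static Properties workerProps() {
        Properties props = new Properties();
        // Top-level security settings: these are what the AdminClient reads.
        props.setProperty("security.protocol", "SASL_PLAINTEXT");
        // Prefixed copies for the worker's producer and consumer.
        props.setProperty("producer.security.protocol", "SASL_PLAINTEXT");
        props.setProperty("consumer.security.protocol", "SASL_PLAINTEXT");
        // Fail fast instead of retrying for the default 120 seconds,
        // which exceeded the test's 60-second startup wait.
        props.setProperty("request.timeout.ms", "10000");
        return props;
    }
}
```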
Third, the Connect workers were recently changed to look up the Kafka cluster ID before starting the herder. This is unlike the older uses of the AdminClient to find and manage the internal topics, where a failure to connect was not necessarily logged correctly but was nevertheless skipped over, relying upon broker auto-topic creation to create the internal topics. (This may be why the test did not fail prior to the recent change to always require a successful AdminClient connection.) Although the worker never got this far in its startup process, the fact that we missed such an error in prior releases means that failure to connect with the AdminClient was not being properly reported.
The `ConnectStandaloneFileTest.test_file_source_and_sink` system tests were run locally prior to this fix, and they failed just as in the nightly runs. Once these fixes were made, the locally run system tests passed.
Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <me@ewencp.org>
Closes #4610 from rhauch/kafka-6577-trunk
fixes error message handling for test consumer client and KafkaStreams instance
updates expected error message
fixes race condition in system test code and avoids starting Streams processor twice
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
Currently, port forwarding is set up for HttpMetricsCollector when the Service's start_node method is called, but it is not canceled during stop. This hasn't presented a problem so far because we don't have tests that use this *and* restart the service. However, if a test/service does that, it will throw an exception since the port is already bound.
This just does the cleanup when stopping so a subsequent attempt to start again will succeed.
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/1320 is a test run, validating the change, for a test that uses ProducerPerformanceService (which in turn uses HttpMetricsCollector).
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Apurva Mehta <apurva@confluent.io>
Closes #4604 from ewencp/cleanup-reverse-port-forward
Read the configuration field names from the ProducerConfig class, and take the key and value serializer names directly from the class names instead of hardcoding them.
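A sketch of the resulting style (the bootstrap servers value and helper are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

class ProducerProps {
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        // Config keys come from ProducerConfig constants, and serializer
        // names come from the classes themselves, not string literals.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        return props;
    }
}
```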
Update `KafkaController.brokerInfo` when listeners are updated, since this value is used to re-register the broker in ZooKeeper if the ZK session expires. Also added a test to verify the values in ZK after session expiry.
This patch fixes a bug in the validation of the inter-broker protocol and the message format version. We should allow the configured message format api version to be greater than the inter-broker protocol api version as long as the actual message format versions are equal. For example, if the message format version is set to 1.0, it is fine for the inter-broker protocol version to be 0.11.0 because they both use message format v2.
I have added a unit test which checks compatibility for all combinations of the message format version and the inter-broker protocol version.
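A hedged sketch of the relaxed validation (recordVersionOf is an illustrative stand-in for Kafka's mapping from api version to message format version, not the actual implementation):

```java
class MessageFormatValidation {
    static void validate(String messageFormatVersion, String interBrokerProtocolVersion) {
        // Compare the message format (record) versions the two api versions
        // map to, rather than comparing the api versions directly.
        if (recordVersionOf(messageFormatVersion) > recordVersionOf(interBrokerProtocolVersion))
            throw new IllegalArgumentException(
                "log.message.format.version " + messageFormatVersion +
                " cannot be used when inter.broker.protocol.version is " + interBrokerProtocolVersion);
    }

    // Illustrative mapping only: e.g. both "0.11.0" and "1.0" use message format v2.
    static int recordVersionOf(String apiVersion) {
        if (apiVersion.startsWith("0.8") || apiVersion.startsWith("0.9")) return 0;
        if (apiVersion.startsWith("0.10")) return 1;
        return 2;
    }
}
```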
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #4583 from hachikuji/KAFKA-6328-REOPENED
Add docs for the ReplicationBytesInPerSec and ReplicationBytesOutPerSec metrics. These metrics were introduced in KIP-153 (KAFKA-5194) but the docs were not updated.
Built site-docs, and viewed them in a web browser to confirm.
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>