src-kafka

Author	SHA1	Message	Date
Lee Dongjin	b28aa4ece6	KAFKA-9586: Fix errored json filename in ops documentation This PR is the counterpart of apache/kafka-site#253. cc/ omkreddy Author: Lee Dongjin <dongjin@apache.org> Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com> Closes #8149 from dongjinleekr/feature/KAFKA-9586	5 years ago
Ron Dagostino	d9b8b86bdd	KAFKA-9575: Mention ZooKeeper 3.5.7 upgrade Author: Ron Dagostino <rdagostino@confluent.io> Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com> Closes #8139 from rondagostino/KAFKA-9575	5 years ago
Guozhang Wang	3b6573c150	KAFKA-9481: Graceful handling TaskMigrated and TaskCorrupted (#8058 ) 1. Removed task field from TaskMigrated; the only caller that encodes a task id from StreamTask actually does not throw, so we only log it. To handle it on StreamThread we just always enforce rebalance (and we would call onPartitionsLost to remove all tasks as dirty). 2. Added TaskCorruptedException with a set of task-ids. The first scenario of this is the restoreConsumer.poll which throws InvalidOffset indicating that the logs are truncated / compacted. To handle it on StreamThread we first close the corresponding tasks as dirty (if EOS is enabled we would also wipe out the state stores), and then revive them into the CREATED state. 3. Also fixed a bug while investigating KAFKA-9572: when suspending / closing a restoring task we should not commit the new offsets but only update the checkpoint file. 4. Re-enabled the unit test.	5 years ago
A. Sophie Blee-Goldman	0d16c26e3c	HOTFIX: don't try to remove uninitialized changelogs from assignment & don't prematurely mark task closed (#8140 ) This fixes two issues which together caused the soak to crash and some tests to fail occasionally. What happened was: In the main StreamThread loop we initialized a new task in TaskManager#checkForCompletedRestoration which includes registering, but not initializing, its changelogs. We then completed the loop and called poll, which resulted in a rebalance that revoked the newly-initialized task. In TaskManager#handleAssignment we then closed the task cleanly and went to remove the changelogs from the StoreChangelogReader only to get an IllegalStateException because the changelog partitions were not in the restore consumer's assignment (due to being uninitialized). This by itself should be a recoverable error, as we catch exceptions here and retry closing the task as unclean. Of course the task actually was successfully closed (clean) so we now get an unexpected exception Illegal state CLOSED while closing active task The fix(es) I'd propose are: 1. Keep the restore consumer's assignment in sync with the registered changelogs, i.e. the set ChangelogReader#changelogs but pause them until they are initialized edit: since the consumer does still perform some actions (e.g. fetches) on paused partitions, we should avoid adding uninitialized changelogs to the restore consumer's assignment. Instead, we should just skip them when removing. 2. Move the StoreChangelogReader#remove call to before the task.closeClean so that the task is only marked as closed if everything was successful. We should do so regardless, as we should (attempt to) remove the changelogs even if the clean close failed and we must do unclean. Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	5 years ago
Lee Dongjin	b2aec9496d	MINOR: Fix javadoc at org.apache.kafka.clients.producer.KafkaProducer.InterceptorCallback#onCompletion (#7337 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
Boyang Chen	776565f7a8	MINOR: Improve EOS example exception handling (#8052 ) The current EOS example mixes fatal and non-fatal error handling. This patch fixes this problem and simplifies the example. Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
Mickael Maison	8ab0994919	MINOR: Fix a number of warnings in clients test (#8073 ) Reviewers: Ismael Juma <ismael@juma.me.uk>, Andrew Choi <li_andchoi@microsoft.com>	5 years ago
zshuo	6d4cfad013	MINOR: Update shell scripts to support z/OS system (#7913 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>	5 years ago
Gunnar Morling	04b6641c14	MINOR: Wording fix in Streams DSL docs (#5692 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>	5 years ago
Stanislav Kozlovski	f51d06712a	MINOR: Add missing @Test annotation to MetadataTest#testMetadataMerge (#8141 ) Reviewers: Brian Byrne <bbyrne@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>	5 years ago
Michael Viamari	a41d3d86c1	KAFKA-9533: ValueTransform forwards `null` values (#8108 ) Fixes a bug where KStream#transformValues would forward null values from the provided ValueTransform#transform operation. A test was added for verification KStreamTransformValuesTest#shouldEmitNoRecordIfTransformReturnsNull. A parallel test for non-key ValueTransformer was not added, as they share the same code path. Reviewers: Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bbejeck@gmail.com>	5 years ago
Lee Dongjin	9942e20a58	MINOR: fix omitted 'next.' in interactive queries documentation (#7883 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
Jason Gustafson	5a19fe6cd1	KAFKA-9544; Fix flaky test `AdminClientTest.testDefaultApiTimeoutOverride` (#8101 ) There is a race condition with the backoff sleep in the test case and setting the next allowed send time in the AdminClient. To fix it, we allow the test case to do the backoff sleep multiple times if needed. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	5 years ago
Sanjana Kaundinya	eb7dfef245	KAFKA-9558; Fix retry logic in KafkaAdminClient listOffsets (#8119 ) This PR is to fix the retry logic for `getListOffsetsCalls`. Previously, if there were partitions with errors, it would only pass in the current call object to retry after a metadata refresh. However, if there's a leader change, the call object never gets updated with the correct leader node to query. This PR fixes this by making another call to `getListOffsetsCalls` with only the error topic partitions as the next calls to be made after the metadata refresh. In addition there is an additional test to test the scenario where a leader change occurs. Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
Boyang Chen	913c61934e	MINOR: Reduce log level to Trace for fetch offset downgrade (#8093 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
David Mao	72122fc069	KAFKA-6266: Repeated occurrence of WARN Resetting first dirty offset (#8089 ) Previously, checkpointed offsets for a log were only updated if the log was chosen for cleaning once the cleaning job completes. This caused issues in cases where logs with invalid checkpointed offsets would repeatedly emit warnings if the log with an invalid cleaning checkpoint wasn't chosen for cleaning. Proposed fix is to update the checkpointed offset for logs with invalid checkpoints regardless of whether it gets chosen for cleaning. Reviewers: Anna Povzner <anna@confluent.io>, Jun Rao <junrao@gmail.com>	5 years ago
Matthias J. Sax	ebcdcd9fa9	KAFKA-8025: Fix flaky RocksDB test (#8126 ) Reviewers: Bill Bejeck <bill@confluent.io>	5 years ago
Chia-Ping Tsai	413db69d41	KAFKA-8245: Fix Flaky Test DeleteConsumerGroupsTest#testDeleteCmdAllGroups (#8032 ) Change unit tests to make sure the consumer group is in Stable state (i.e. consumers have completed joining the group)	5 years ago
Ismael Juma	2c0c2c595b	KAFKA-9515: Upgrade ZooKeeper to 3.5.7 (#8125 ) A couple of critical fixes: ZOOKEEPER-3644: Data loss after upgrading standalone ZK server 3.4.14 to 3.5.6 with snapshot.trust.empty=true ZOOKEEPER-3701: Split brain on log disk full (3.5) Full release notes: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12346098 Reviewers: Bill Bejeck <bbejeck@gmail.com>	5 years ago
Rajini Sivaram	b1449f683c	MINOR: Add upgrade note about TLSv1 and TLSv1.1 being disabled in 2.5.0 (#8128 ) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Nikolay	f364281431	KAFKA-9319: Fix generation of CA certificate for system tests. (#8106 ) Newer versions of Java have added checks to ensure that trust anchors are CA certificates and contain proper extensions. This PR adds Basic Constraints extension with the CA field set to true for system tests. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	5 years ago
Boyang Chen	97c5dc1c13	KAFKA-9545: Fix subscription bugs from Stream refactoring (#8109 ) This PR fixes two bugs related to stream refactoring: 1. The subscribed topics are not updated correctly when a topic gets removed from the broker. 2. The remainingPartitions computation doesn't account for the case when one task has a pattern subscription of multiple topics. Then the input partition change will not be assumed as containsAll The bugs are exposed from integration test testRegexMatchesTopicsAWhenDeleted and could be used to verify the fix works. Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	5 years ago
Boyang Chen	863b534f83	KAFKA-9535; Update metadata before retrying partitions when fetching offsets (#8088 ) Today if we attempt to list offsets with a fenced leader epoch, consumer will retry without updating the metadata until the timeout is reached. This affects synchronous APIs such as `offsetsForTimes`, `beginningOffsets`, and `endOffsets`. The fix in this patch is to trigger the metadata update call whenever we see a retriable error before additional attempts. Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
Bob Barrett	937f1f741c	KAFKA-8805; Bump producer epoch on recoverable errors (#7389 ) This change is the client-side part of KIP-360. It identifies cases where it is safe to abort a transaction, bump the producer epoch, and allow the application to continue without closing the producer. In these cases, when KafkaProducer.abortTransaction() is called, the producer sends an InitProducerId following the transaction abort, which causes the producer epoch to be bumped. The application can then start a new transaction and continue processing. For recoverable errors in the idempotent producer, the epoch is bumped locally. In-flight requests for partitions with an error are rewritten to reflect the new epoch, and in-flights of all other partitions are allowed to complete using the old epoch. Reviewers: Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>	5 years ago
Guozhang Wang	d8756e81c5	KAFKA-9274: Gracefully handle timeout exception (#8060 ) 1. Delay the initialization (producer.initTxn) from construction to maybeInitialize; if it times out we just swallow and retry in the next iteration. 2. If completeRestoration (consumer.committed) times out, just swallow and retry in the next iteration. 3. For other calls (producer.partitionsFor, producer.commitTxn, consumer.commit), treat the timeout exception as fatal. Reviewers: Matthias J. Sax <matthias@confluent.io>	5 years ago
Konstantine Karantasis	16ee326755	KAFKA-9556; Fix two issues with KIP-558 and expand testing coverage (#8085 ) Correct the Connect worker logic to properly disable the new topic status (KIP-558) feature when `topic.tracking.enable=false`, and fix automatic topic status reset after a connector is deleted. Also adds new `ConnectorTopicsIntegrationTest` and expanded unit tests. Reviewers: Randall Hauch <rhauch@gmail.com>	5 years ago
John Roesler	8d0b069b0f	KAFKA-9557: correct thread process-rate sensor to measure throughput (#8112 ) Correct the process-rate (and total) sensor to measure throughput (and total record processing count). Reviewers: Guozhang Wang <guozhang@confluent.io>	5 years ago
vinoth chandar	37d0b12cfc	KAFKA-9512: Flaky Test LagFetchIntegrationTest.shouldFetchLagsDuringRestoration (#8076 ) - Added additional synchronization and increased timeouts to handle flakiness - Added some precautionary retries when trying to obtain lag map Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
Jason Gustafson	9da6823a3a	HOTFIX: Fix breakage in `ConsumerPerformanceTest` (#8113 ) Test cases in `ConsumerPerformanceTest` were failing and causing a system exit rather than throwing the expected exception following #8023. We didn't catch this because the tests were marked as skipped and not failed. Reviewers: Guozhang Wang <guozhang@confluent.io>	5 years ago
Mitch	96c69da8c1	KAFKA-8507; Unify connection name flag for command line tool [KIP-499] (#8023 ) This change updates ConsoleProducer, ConsumerPerformance, VerifiableProducer, and VerifiableConsumer classes to add and prefer the --bootstrap-server flag for defining the connection point of the Kafka cluster. This change is part of KIP-499: https://cwiki.apache.org/confluence/display/KAFKA/KIP-499+-+Unify+connection+name+flag+for+command+line+tool. Reviewers: Ron Dagostino <rdagostino@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>	5 years ago
Xavier Léauté	7e1c39f75a	KAFKA-9106 make metrics exposed via jmx configurable (#7674 ) Reviewers: Colin P. McCabe <cmccabe@apache.org>, Rajini Sivaram <rajinisivaram@googlemail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>	5 years ago
Stanislav Kozlovski	ea72edebf2	MINOR: Do not override retries for idempotent producers (#8097 ) The KafkaProducer code would set infinite retries (MAX_INT) if the producer was configured with idempotence and no retries were configured by the user. This is superfluous because KIP-91 changed the retry functionality to both be time-based and the default retries config to be MAX_INT. Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
huxi	46e80dbd20	KAFKA-9538; Fix flaky test `testResetOffsetsExportImportPlan` (#6561 ) This patch adds logic to the test case to ensure that consumer groups are in a valid state prior to attempting offset reset. Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
Konstantine Karantasis	97d2c726f1	MINOR: Small Connect integration test fixes (#8100 ) Author: Konstantine Karantasis <konstantine@confluent.io> Reviewer: Randall Hauch <rhauch@gmail.com>	5 years ago
Lev Zemlyanov	f51e1e6548	allow ReplaceField SMT to handle tombstone records (#7731 ) Signed-off-by: Lev Zemlyanov <lev@confluent.io>	5 years ago
Lev Zemlyanov	c8f1ee9cd9	KAFKA-9192: fix NPE when converting optional JSON schema in structs (#7733 ) Author: Lev Zemlyanov <lev@confluent.io> Reviewers: Greg Harris <gregh@confluent.io>, Randall Hauch <rhauch@gmail.com>	5 years ago
Boyang Chen	07db26c20f	KAFKA-9417: New Integration Test for KIP-447 (#8000 ) This change mainly has 2 components: 1. extend the existing transactions_test.py to also try out new sendTxnOffsets(groupMetadata) API to make sure we are not introducing any regression or compatibility issue a. We shrink the time window to 10 seconds for the txn timeout scheduler on broker so that we could trigger expiration sooner rather than later 2. create a completely new system test class called group_mode_transactions_test which is more complicated than the existing system test, as we are taking rebalance into consideration and using multiple partitions instead of one. For further breakdown: a. The message count was done on partition level, instead of global as we need to visualize the per partition order throughout the test. For this purpose, we extend ConsoleConsumer to print out the data partition as well to help message copier interpret the per partition data. b. The progress count includes the time for completing the pending txn offset expiration c. More visibility and feature improvements on TransactionMessageCopier to better work under either standalone or group mode. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	5 years ago
Matthias J. Sax	aa0d0ec32a	KAFKA-6607: Commit correct offsets for transactional input data (#8091 ) Reviewers: Guozhang Wang <guozhang@confluent.io>	5 years ago
David Jacot	2cbd3d7519	KAFKA-9499; Improve deletion process by batching more aggressively (#8053 ) This PR speeds up the deletion process by doing the following: - Batch whenever possible to minimize the number of requests sent out to other brokers; - Refactor `onPartitionDeletion` to remove the usage of `allLiveReplicas`. Reviewers: Jason Gustafson <jason@confluent.io>	5 years ago
Jason Gustafson	0a5dec0b3a	MINOR: Fix unnecessary metadata fetch before group assignment (#8095 ) The recent increase in the flakiness of one of the offset reset tests (KAFKA-9538) traces back to https://github.com/apache/kafka/pull/7941. After investigation, we found that following this patch, the consumer was sending an additional metadata request prior to performing the group assignment. This slight timing difference was enough to trigger the test failures. The problem turned out to be due to a bug in `SubscriptionState.groupSubscribe`, which no longer counted the local subscription when determining if there were new topics to fetch metadata for. Hence the extra metadata update. This patch restores the old logic. Without the fix, we saw 30-50% test failures locally. With it, I could no longer reproduce the failure. However, #6561 is probably still needed to improve the resilience of this test. Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>	5 years ago
John Roesler	1681c78f60	KAFKA-9500: Fix FK Join Topology (#8015 ) Corrects a flaw leading to an exception while building topologies that include both: * A foreign-key join with the result not explicitly materialized * An operation after the join that requires source materialization Also corrects a flaw in TopologyTestDriver leading to output records being enqueued in the wrong order under some (presumably rare) circumstances. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	5 years ago
John Roesler	998f1520f9	KAFKA-9503: Fix TopologyTestDriver output order (#8065 ) Migrates TopologyTestDriver processing to be closer to the same processing/ordering semantics as KafkaStreams. This corrects the output order for recursive topologies as reported in KAFKA-9503, and also works similarly in the case of task idling.	5 years ago
Bruno Cadonna	cde6d18983	KAFKA-9355: Fix bug that removed RocksDB metrics after failure in EOS (#7996 ) * Added init() method to RocksDBMetricsRecorder * Added call to init() of RocksDBMetricsRecorder to init() of RocksDB store * Added call to init() of RocksDBMetricsRecorder to openExisting() of segmented state stores * Adapted unit tests * Added integration test that reproduces the situation in which the bug occurred Reviewers: Guozhang Wang <wangguoz@gmail.com>	5 years ago
John Roesler	e16859dc48	KAFKA-9390: Make serde pseudo-topics unique (#8054 ) During the discussion for KIP-213, we decided to pass "pseudo-topics" to the internal serdes we use to construct the wrapper serdes for CombinedKey and hashing the left-hand-side value. However, during the implementation, this strategy wasn't fully implemented, and we wound up using the same topic name for a few different data types. Reviewers: Guozhang Wang <guozhang@confluent.io>	5 years ago
Colin Patrick McCabe	a149c3effa	MINOR: improve error reporting in DescribeConsumerGroupTest (#8080 ) Reviewers: David Arthur <mumrah@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>	5 years ago
high.lee	dc89c86d43	KAFKA-9483: Add Scala KStream#toTable to the Streams DSL (#8024 ) Part of KIP-523 Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>	5 years ago
Matthias J. Sax	50aead64b9	MINOR: fix and improve StreamsConfig JavaDocs (#8086 ) Reviewer: John Roesler <john@confluent.io>	5 years ago
Konstantine Karantasis	0f14aef8cb	MINOR: Start using Response and replace IOException in EmbeddedConnectCluster for failures (#8055 ) Changed `EmbeddedConnectCluster` to add utility methods that return `Response`, throw `ConnectException` instead of `IOException` for failures, and deprecate the old methods that returned primitive types rather than `Response`. Also introduce common assertions for embedded clusters under `EmbeddedConnectClusterAssertions`. Author: Konstantine Karantasis <konstantine@confluent.io> Reviewer: Randall Hauch <rhauch@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	5 years ago
Gunnar Morling	5727b24509	KAFKA-7052 Avoiding NPE in ExtractField SMT in case of non-existent fields (#8059 ) Author: Gunnar Morling <gunnar.morling@googlemail.com> Reviewer: Randall Hauch <rhauch@gmail.com>	5 years ago
John Roesler	520a76155c	KAFKA-9517: Fix default serdes with FK join (#8061 ) During the KIP-213 implementation and verification, we neglected to test the code path for falling back to default serdes if none are given in the topology. Reviewer: Bill Bejeck <bbejeck@gmail.com>	5 years ago


7336 Commits (6e0d553350cef876f4fd2de0e3b8e6e40ce6be44)
