Currently, scala.Serdes.String, for example, invokes Serdes.String() once and caches the result.
However, the implementation of the String serde has a non-empty configure method whose behavior varies depending on whether it is used as a key or a value serde. So we won't get correct execution if we create one serde instance and use it for both keys and values.
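A minimal sketch of the shape of the fix (object and member names are stand-ins for the real scala.Serdes):

```scala
import org.apache.kafka.common.serialization.{Serde, Serdes => JSerdes}

object StringSerdeSketch {
  // Before (problematic): a val caches a single instance that then gets
  // configure()d as both a key serde and a value serde:
  //   val cachedString: Serde[String] = JSerdes.String()

  // After: a def hands out a fresh instance per call, so key and value
  // usages are configured independently.
  def stringSerde: Serde[String] = JSerdes.String()
}
```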
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
* Refactor the StreamThread main loop as follows (a schematic sketch follows the list):
1. Fetch from consumer and enqueue data to tasks.
2. Check if any tasks should have processing enforced.
3. Loop over processable tasks and process them for N iterations, then check for 1) commit, 2) punctuate, and 3) the need to call consumer.poll.
4. Even if there is no data to process in this iteration, still check whether commit / punctuate is needed.
5. Finally, try to update standby tasks.
* Add an optimization to commit only when needed (i.e., at least one process() or punctuate() call was triggered since the last commit).
* Found and fixed a ProducerFencedException scenario: a producer.send() call never throws a ProducerFencedException directly, but it may throw a KafkaException whose cause is a ProducerFencedException.
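A schematic sketch of the resulting loop shape (all names are invented; this is not the actual StreamThread code):

```scala
trait StreamTaskSketch {
  def process(): Boolean        // true if a record was processed
  def maybePunctuate(): Boolean // true if a punctuation fired
}

class MainLoopSketch(tasks: Seq[StreamTaskSketch], numIterations: Int) {
  private var workSinceLastCommit = 0

  def runOnce(): Unit = {
    pollAndEnqueue()                  // 1. fetch from consumer, enqueue data to tasks
    markEnforcedProcessingTasks()     // 2. check for enforced processing
    for (_ <- 0 until numIterations)  // 3. process processable tasks for N iterations
      workSinceLastCommit += tasks.count(_.process())
    // 4. commit/punctuate checks run even when no data was processed...
    workSinceLastCommit += tasks.count(_.maybePunctuate())
    if (workSinceLastCommit > 0) {    // ...but commit only if some work happened
      commitAll()                     //    (the optimization noted above)
      workSinceLastCommit = 0
    }
    updateStandbyTasks()              // 5. finally, try to update standby tasks
  }

  private def pollAndEnqueue(): Unit = ()
  private def markEnforcedProcessingTasks(): Unit = ()
  private def commitAll(): Unit = ()
  private def updateStandbyTasks(): Unit = ()
}
```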
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
Previously, we depicted creating a Jackson serde for every POJO class, which becomes a burden in practice. There are many ways to avoid this and have just a single serde, so we've decided to model this design choice instead.
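One possible shape of that single serde, as a sketch (assumes Jackson with jackson-module-scala on the classpath; the class name and details are illustrative, not the exact code from the docs):

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import org.apache.kafka.common.serialization.{Deserializer, Serde, Serializer}

import scala.reflect.ClassTag

// One generic serde covers every POJO/case class: new JacksonSerde[MyEvent]
class JacksonSerde[T >: Null](implicit tag: ClassTag[T]) extends Serde[T] {
  private val mapper = new ObjectMapper().registerModule(DefaultScalaModule)

  override def configure(configs: java.util.Map[String, _], isKey: Boolean): Unit = ()
  override def close(): Unit = ()

  override def serializer(): Serializer[T] = new Serializer[T] {
    override def configure(configs: java.util.Map[String, _], isKey: Boolean): Unit = ()
    override def close(): Unit = ()
    override def serialize(topic: String, data: T): Array[Byte] =
      if (data == null) null else mapper.writeValueAsBytes(data)
  }

  override def deserializer(): Deserializer[T] = new Deserializer[T] {
    override def configure(configs: java.util.Map[String, _], isKey: Boolean): Unit = ()
    override def close(): Unit = ()
    override def deserialize(topic: String, bytes: Array[Byte]): T =
      if (bytes == null) null
      else mapper.readValue(bytes, tag.runtimeClass.asInstanceOf[Class[T]])
  }
}
```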
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
This PR fixes the previously recursive call in the Streams Scala API's peek method.
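A condensed sketch of the failure mode and the fix (the wrapper is simplified from the real org.apache.kafka.streams.scala classes):

```scala
import org.apache.kafka.streams.kstream.{ForeachAction, KStream => KStreamJ}

class KStreamSketch[K, V](val inner: KStreamJ[K, V]) {
  // The bug: a body of `peek(action)` here would call this very method
  // again and recurse until a StackOverflowError. The fix delegates to
  // the wrapped Java KStream instead:
  def peek(action: (K, V) => Unit): KStreamSketch[K, V] =
    new KStreamSketch(inner.peek(new ForeachAction[K, V] {
      override def apply(key: K, value: V): Unit = action(key, value)
    }))
}
```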
Reviewers: Joan Goyeau <joan@goyeau.com>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
Changes:
1. Add org.apache.kafka.streams.processor.internals.metrics.CumulativeCount analogous to Count, but not a SampledStat
2. Use CumulativeCount for -total metrics in streams instead of Count (the idea is sketched below)
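The idea in sketch form (simplified; the real class lives in the streams internals and its exact shape may differ):

```scala
import org.apache.kafka.common.metrics.{MeasurableStat, MetricConfig}

// Unlike Count, a SampledStat whose expired samples are dropped, this stat
// never forgets, so a -total metric only ever increases.
class CumulativeCountSketch extends MeasurableStat {
  private var total = 0.0
  override def record(config: MetricConfig, value: Double, timeMs: Long): Unit =
    total += 1 // count occurrences; the recorded value itself is ignored
  override def measure(config: MetricConfig, now: Long): Double = total
}
```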
Testing strategy:
Add a test in StreamsMetricsImplTest that fails on the old, incorrect behavior.
The contribution is my original work and I license the work to the project under the project's open source license.
Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
Update to KIP-328.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Ted Yu <yuzhihong@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
Plus minor javadoc cleanups.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
Due to the lack of a conversion to the kstream Predicate, the existing filter method in KTable.scala would result in a StackOverflowError.
This PR fixes the bug and adds testing for it.
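A condensed sketch of the fixed method with the explicit conversion in place (the wrapper is simplified from the real KTable.scala):

```scala
import org.apache.kafka.streams.kstream.{Predicate, KTable => KTableJ}

class KTableSketch[K, V](val inner: KTableJ[K, V]) {
  // Converting the Scala function to a kstream Predicate up front ensures
  // the call resolves to the Java overload instead of looping back here.
  def filter(predicate: (K, V) => Boolean): KTableSketch[K, V] = {
    val javaPredicate: Predicate[K, V] = new Predicate[K, V] {
      override def test(key: K, value: V): Boolean = predicate(key, value)
    }
    new KTableSketch(inner.filter(javaPredicate))
  }
}
```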
Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
Join in the Scala streams API is currently unusable in 2.0.0 as reported by @mowczare:
#5019 (comment)
This is due to an overload of it with the same signature in the first curried parameter.
See compiler issue that didn't catch it: https://issues.scala-lang.org/browse/SI-2628
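A minimal, self-contained illustration of that limitation (deliberately unrelated to the Kafka API): two overloads that share a first parameter list compile fine, but calls cannot be disambiguated by the second parameter list alone.

```scala
object CurriedOverloads {
  def run(name: String)(f: Int => Int): Int = f(name.length)
  def run(name: String)(f: (Int, Int) => Int): Int = f(name.length, name.length)

  // run("x")(i => i + 1) // fails to compile: the overload is chosen from
  // the first parameter list alone (SI-2628), so the lambda's parameter
  // types cannot be inferred, even though only one overload would fit.
}
```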
Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
This PR adds valueChangingOperation and mergeNode to StreamsGraphNode#toString
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Updated two integration tests to use IntegrationTestUtils#waitUntilFinalKeyValueRecordsReceived to eliminate flaky test results.
Also, I updated the IntegrationTestUtils#waitUntilFinalKeyValueRecordsReceived method to support results that contain the same key with different values.
For testing, I ran the current suite of streams tests.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
1) As titled, add a rewriteTopology that (a) sets the application id, (b) maybe disables caching, and (c) adjusts for source KTables. This optimization can hence be applied to both DSL- and PAPI-generated Topologies.
2) Defer the building of globalStateStores to rewriteTopology so that we can also disable caching for them. But we still need to build the state stores before InternalTopologyBuilder.build(), since we should only build global stores once for all threads.
3) Added withCachingDisabled to StoreBuilder; it is a public API change (usage sketched below).
4) [Optional] Fixed the unit tests' config-setting functionality and set the necessary configs to shorten unit test latency (it now drops from 5 min to 3.5 min on my laptop).
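Usage of the new StoreBuilder method, as a sketch (the store name and serdes are illustrative):

```scala
import org.apache.kafka.common.serialization.{Serdes => JSerdes}
import org.apache.kafka.streams.state.Stores

object CachingDisabledExample {
  val storeBuilder = Stores
    .keyValueStoreBuilder(
      Stores.persistentKeyValueStore("counts"), // hypothetical store name
      JSerdes.String(),
      JSerdes.Long())
    .withCachingDisabled() // the new public API added here
}
```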
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Ted Yu <yuzhihong@gmail.com>
This PR adds an optimization that eliminates multiple repartition topics: when the KStream resulting from a key-changing operation executes several downstream operations using the new key, the repartition topics are reduced to one.
Note that this PR leaves in place the optimization for re-using a source topic as a changelog topic for source KTable instances. I'll have another follow-up PR to move the source topic optimization to a method within InternalStreamsBuilder so it can be performed in the same area of the code.
Additionally, the current value of StreamsConfig.OPTIMIZE is "all", and we'll need to have another KIP to change the value to "2.1".
Testing: an integration test, RepartitionOptimizingIntegrationTest, asserts that the optimized topology with one repartition topic produces the same results as the un-optimized version with four repartition topics.
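For intuition, a sketch of the kind of topology that benefits (topics and operations are illustrative):

```scala
import java.util.Properties

import org.apache.kafka.streams.StreamsConfig
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder

object RepartitionOptimizationSketch {
  val props = new Properties()
  // opt in to the optimization; StreamsConfig.OPTIMIZE is currently "all"
  props.put(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE)

  val builder = new StreamsBuilder()
  val rekeyed = builder
    .stream[String, String]("input")    // hypothetical topic
    .selectKey((_, value) => value)     // key-changing operation
  rekeyed.groupByKey.count()            // would need repartition topic #1...
  rekeyed.groupByKey.reduce(_ + _)      // ...and #2; optimized down to one
}
```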
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Part I of KIP-328:
* add grace period to Windows (usage sketched after this list)
* deprecate retention/maintainMs and segmentInterval from Windows
* record expired records in the store with a new metric
* record late record drops as a new metric instead of as a "skipped record"
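A sketch of the new API surface (exact parameter types may differ slightly in the released version):

```scala
import java.time.Duration
import org.apache.kafka.streams.kstream.TimeWindows

object GracePeriodSketch {
  val windows = TimeWindows
    .of(Duration.ofMinutes(5).toMillis) // 5-minute windows
    .grace(Duration.ofMinutes(1))       // accept records up to 1 minute late
}
```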
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
#5468 introduced a breaking API change that was actually avoidable. This PR re-introduces the old API as deprecated and alters the API introduced by #5468 to be consistent with the other methods.
Also fixed miscellaneous minor issues:
- fixed a log statement in the Topology builder
- addressed some warnings shown by IntelliJ
Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Satish Duggana <satishd@apache.org>, Matthias J. Sax <matthias@confluent.io>
While working on the 4th PR, I noticed that I had missed adding stores via the graph; they were added directly via the InternalStreamsBuilder instead. Probably OK to do so, but we should be consistent.
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
While debugging the reported issue, I found that our current unit test lacks coverage to actually expose the underlying root cause.
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
1. In each iteration, consider a task processable only if all of its partitions contain data, so it can decide which record to process next.
1.a Add one exception: if the task does have data on some but not all of its partitions, we only consider it not processable for a finite number of iterations (see the sketch after this list).
1.b Add a task-level metric to record whenever we are forced to process a task with only partial data available, since it may lead to non-determinism.
2. Split the main loop into enqueue-raw-data and process-them phases, since not all data put into the queues will now be processed completely within a single iteration.
3. NOTE that within an iteration, a task that has exhausted one of its queues will still be processed, since we only update the processable list once per iteration; I'm improving on this in the follow-up part III PR.
4. Found and fixed a bug in metrics recording: the taskName and sensorName parameters were swapped.
5. Optimized the task stream time computation again, since our current partition stream time reasoning has been simplified.
6. Added unit tests.
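A schematic sketch of the processable rule from 1/1.a/1.b (all names invented):

```scala
class ProcessableSketch(allPartitions: Set[Int], maxWaitRounds: Int) {
  private var starvedRounds = 0

  // A task is processable when every input partition has buffered data.
  // With partial data it stays unprocessable for a bounded number of
  // rounds, after which it is force-processed and a task-level metric is
  // bumped, since enforced processing may lead to non-determinism.
  def isProcessable(partitionsWithData: Set[Int]): Boolean =
    if (partitionsWithData == allPartitions) {
      starvedRounds = 0
      true
    } else if (partitionsWithData.nonEmpty && starvedRounds >= maxWaitRounds) {
      starvedRounds = 0
      recordEnforcedProcessing() // the new task-level metric from 1.b
      true
    } else {
      starvedRounds += 1
      false
    }

  private def recordEnforcedProcessing(): Unit = () // metric stub
}
```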
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <vvcephei@users.noreply.github.com>, Bill Bejeck <bbejeck@gmail.com>
The specific changes in this PR from the second PR include:
1. Changed the types of graph nodes to names that convey more context
2. Build the entire physical plan from the graph, after StreamsBuilder.build() is called.
Other changes are addressed directly as review comments on the PR.
Testing consists of using all existing streams tests to validate building the physical plan from the graph.
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
Use delivery timeout instead of retries when possible and remove various TODOs associated with completion of KIP-91.
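In configuration terms, this replaces retry-count reasoning with a single end-to-end bound:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.ProducerConfig

object DeliveryTimeoutExample {
  val props = new Properties()
  // One bound on the total time to deliver a record (KIP-91; the default
  // is 120000 ms), instead of estimating it from retries and backoffs.
  props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, "120000")
}
```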
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
* new minimum is 0, just like window size
* refactor tests to use smaller segment sizes as well
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
1. When we reinitialize the state store because there is no CHECKPOINT with EOS turned on, we should update the checkpoint to the position given by consumer.seekToBeginning() / consumer.position() to avoid falling into endless restore iterations (sketched below).
2. Fixed a few other logic bugs around needsInitializing and needsRestoring.
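A sketch of the checkpoint fix in 1. (method and variable names invented):

```scala
import java.util.{Collection => JCollection}

import org.apache.kafka.clients.consumer.Consumer
import org.apache.kafka.common.TopicPartition

import scala.collection.mutable

object ReinitializeSketch {
  def reinitialize(consumer: Consumer[Array[Byte], Array[Byte]],
                   partitions: JCollection[TopicPartition],
                   checkpoints: mutable.Map[TopicPartition, Long]): Unit = {
    consumer.seekToBeginning(partitions)
    partitions.forEach { tp =>
      // Record where restoration actually starts; without this, the missing
      // checkpoint triggers reinitialization again on the next pass.
      checkpoints.put(tp, consumer.position(tp))
    }
  }
}
```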
Reviewers: Jason Gustafson <jason@confluent.io>, Bill Bejeck <bbejeck@gmail.com>