src-kafka

Commit Graph

Author	SHA1	Message	Date
Jagadesh Adireddi	a05e33693b	KAFKA-6677: Fixed StreamsConfig producer's max-in-flight allowed when EOS enabled. (#4868 ) Reviewers: Matthias J Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Debasish Ghosh	b2e4db01b6	KAFKA-6670: Implement a Scala wrapper library for Kafka Streams This PR implements a Scala wrapper library for Kafka Streams. The library is implemented as a project under streams, namely `:streams:streams-scala`. The PR contains the following: * the library implementation of the wrapper abstractions * the test suite * the changes in `build.gradle` to build the library jar The library has been tested running the tests as follows: ``` $ ./gradlew -Dtest.single=StreamToTableJoinScalaIntegrationTestImplicitSerdes streams:streams-scala:test $ ./gradlew -Dtest.single=StreamToTableJoinScalaIntegrationTestImplicitSerdesWithAvro streams:streams-scala:test $ ./gradlew -Dtest.single=WordCountTest streams:streams-scala:test ``` Author: Debasish Ghosh <ghosh.debasish@gmail.com> Author: Sean Glover <seglo@randonom.com> Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>, John Roesler <john@confluent.io>, Damian Guy <damian@confluent.io>, Guozhang Wang <wangguoz@gmail.com> Closes #4756 from debasishg/scala-streams	7 years ago
John Roesler	ed51b2cdf5	KAFKA-6376; refactor skip metrics in Kafka Streams * unify skipped records metering * log warnings when things get skipped * tighten up metrics usage a bit ### Testing strategy: Unit testing of the metrics and the logs should be sufficient. Author: John Roesler <john@confluent.io> Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com> Closes #4812 from vvcephei/kip-274-streams-skip-metrics	7 years ago
Jagadesh Adireddi	b510737e76	KAFKA-5253: Fixed TopologyTestDriver to handle streams created with patterns (#4793 ) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Guozhang Wang	1f523d9d72	MINOR: add window store range query in simple benchmark (#4894 ) There are a couple minor additions in this PR: 1. Add a new test for window store, to range query upon receiving each record. 2. In the non-windowed state store case, add a get call before the put call. 3. Enable caching by default to be consistent with other Join / Aggregate cases, where caching is enabled by default. Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Bill Bejeck	93fd5707fa	MINOR: Retry setting aligned time until set (#4893 ) In the AbstractResetIntegrationTest we can have a transient error when setting the time for the test where the new time is less than the original time, for those cases we should catch the exception and re-try setting the time once versus letting the test fail. For testing, ran the entire streams test suite. Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Bill Bejeck	2162d0713f	KAFKA-6802: Improved logging for missing topics during task assignment (#4891 ) If users don't create all topics before starting a streams application, they could get unexpected results. For example, when a state store is shared between sub-topologies and one input topic is not created ahead of time, the resulting log message "Partition X is not assigned to any tasks" does not give any clues as to how this could have occurred. Also, this PR changes the log level from INFO to WARN when metadata does not have partitions for a given topic. Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
Jimin Hsieh	6b2be2693c	KAFKA-6775: Fix the issue of without init super class's (#4859 ) Some anonymous classes of AbstractProcessor didn't initialize their superclass. This will not set up ProcessorContext context at AbstractProcessor. Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Matthias J. Sax	cae42215b7	KAFKA-6054: Update Kafka Streams metadata to version 3 (#4880 ) - adds Streams upgrade tests for 1.1 release - introduces metadata version 3 Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Guozhang Wang	0e0fd4fe8d	HOTFIX: use the new prop object (#4888 )	7 years ago
John Roesler	ac9c3ed0b4	KAFKA-6376: preliminary cleanup (#4872 ) General cleanup of Streams code, mostly resolving compiler warnings and re-formatting. The regular testing suite should be sufficient. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Valentino Proietti	01eddce01f	KAFKA-6742: TopologyTestDriver error when dealing with stores from GlobalKTable guozhangwang While TopologyTestDriver works well with stores created from KTable it does not with stores from GlobalKTable. Moreover, for my testing purposes, and I think it can be useful to others, I need to get access to the MockProducer inside TopologyTestDriver. I have added 4 new tests to TopologyTestDriverTest, two for stores from KTable and two for stores from GlobalKTable. While I was changing the TopologyTestDriver I've also made it implement java.io.Closeable. Author: Valentino Proietti <valentino.proietti@kydea.com> Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com> Closes #4823 from Vale68/KAFKA-6742 minor renaming	7 years ago
Guozhang Wang	9871357086	KAFKA-6592: Follow-up (#4864 ) Do not require ConsoleConsumer to specify inner serde as a special property, but just a normal property of the message formatter.	7 years ago
Guozhang Wang	0dc7f0e66f	KAFKA-6611, PART II: Improve Streams SimpleBenchmark (#4854 ) SimpleBenchmark: 1.a Do not rely on manual num.records / bytes collection on atomic integers. 1.b Rely on config files for num.threads, bootstrap.servers, etc. 1.c Add parameters for key skewness and value size. 1.d Refactor the tests for loading phase, adding tumbling-windowed count. 1.e For consumer / consumeproduce, collect metrics on consumer instead. 1.f Force stop the test after 3 minutes, this is based on empirical numbers of 10M records. Other tests: use config for kafka bootstrap servers. streams_simple_benchmark.py: only use scale 1 for system test, remove yahoo from benchmark tests. Note that the JMX based metrics is more accurate than the manually collected metrics. Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
huxi	4e35a2bfb7	KAFKA-6592: ConsoleConsumer should support WindowedSerdes (#4797 ) Have Console consumer support TimeWindowedDeserializer/SessionWindowedDeserializer. Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
Matthias J. Sax	0c0d8363e5	KAFKA-6054: Fix upgrade path from Kafka Streams v0.10.0 (#4779 ) Reviewers: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Damian Guy <damian@confluent.io>	7 years ago
tedyu	ac542c9a83	KAFKA-6747 Check whether there is in-flight transaction before aborting transaction (#4826 ) As Frederic reported on mailing list under the subject "kafka-streams Invalid transition attempted from state READY to state ABORTING_TRANSACTION", producer#abortTransaction should only be called when transactionInFlight is true. Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>	7 years ago
fredfp	3abd410708	KAFKA-6748: double check before scheduling a new task after the punctuate call (#4827 ) After the punctuate() call, we would like to double check on the scheduled flag since the call itself may cancel it. Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <john@confluent.io>	7 years ago
Guozhang Wang	9313e18fbb	KAFKA-6560: Use single query for getters as well (#4814 ) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>	7 years ago
khairy	1815e01188	MINOR: Refactor return value (#4810 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
huxi	b8f8ce4146	KAFKA-6731: waitOnState should check the state to be the target start. (#4808 ) KafkaStreams.waitOnState() should check the state to be the given one instead of the hard-coded `NOT_RUNNING`. Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
Guozhang Wang	2e5d4af83f	MINOR: refactor error message of task migration (#4803 ) In the stream thread capture of the TaskMigration exception, print the task full information in WARN. In other places only log as INFO, plus additional context information. Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
JieFang.He	fc0d0021cc	KAFKA-6707: The default value for config of Type.LONG should be *L (#4762 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
JieFang.He	cb7cf7c5a7	KAFKA-6702: Wrong className in LoggerFactory.getLogger method (#4772 ) Reviewers: Manikumar Reddy, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>	7 years ago
Guozhang Wang	28f1fc2f55	MINOR: Change getMessage to toString (#4790 ) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Boyang Chen	964693e40d	KAFKA-6386: Use Properties instead of StreamsConfig in KafkaStreams constructor This pull request targets https://issues.apache.org/jira/browse/KAFKA-6386 The minor fix to deprecate usage of `StreamsConfig` in favor of `java.util.Properties`. I created separate public constructors using `Properties` in order to replace the old ones, and prioritize new functions in the `KafkaStreams.java` file. Since this is my first time doing open source contribution, I'm very happy to get any comment or pointer to be more professional and get better next time, thank you Guozhang guozhangwang and Liquan Ishiihara! testing strategy: existing unit tests should suffice to cover this change. Author: cs427fa16staff <bchen11@outlook.com> Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io> Closes #4354 from abbccdda/starter github comments	7 years ago
John Roesler	adbf31ab1d	KAFKA-6473: Add MockProcessorContext to public test-utils (#4736 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Guozhang Wang	f2fbfaaccc	KAFKA-6611: PART I, Use JMXTool in SimpleBenchmark (#4650 ) 1. Use JmxMixin for SimpleBenchmark (will remove the self reporting in #4744), only when loading phase is false (i.e. we are in fact starting the streams app). 2. Reported the full jmx reported metrics in log files, and in the returned data only return the max values: this is because we want to skip the warming up and cooling down periods that will have lower rate numbers, while max represents the actual rate at full speed. 3. Incorporates two other improvements to JMXTool: #1241 and #2950 Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Rohan Desai <desai.p.rohan@gmail.com>	7 years ago
Stuart Perks	9ee00c4b66	KAFKA-6659: Improve error message if state store is not found (#4732 ) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Guozhang Wang	0f364cd53a	MINOR: Pass a streams config to replace the single state dir (#4714 ) This is a general change and is a prerequisite to allow streams benchmark test with different streams tests. For the streams benchmark itself I will have a separate PR for switching configs. Details: 1. Create a "streams.properties" file under PERSISTENT_ROOT before all the streams test. For now it will only contain a single config of state.dir pointing to PERSISTENT_ROOT. 2. For all the system test related code, replace the main function parameter of state.dir with propsFilename, then inside the function load the props from the file and apply overrides if necessary. 3. Minor fixes. Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
asutosh936	02a8ef8595	KAFKA-6486: Implemented LinkedHashMap in TimeWindows (#4628 ) Reviewers: Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Sandor Murakozi	2afac71566	MINOR: Remove unnecessary null checks (#4708 ) Remove unnecessary null check in StringDeserializer, MockProducerInterceptor and KStreamImpl. Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Jason Gustafson <jason@confluent.io>	7 years ago
Matthias J. Sax	394aa74261	KAFKA-6454: Allow timestamp manipulation in Processor API (#4519 ) Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Kamal C	a6fad27372	KAFKA-6106: Postpone normal processing of tasks within a thread until restoration of all tasks have completed. (#4651 ) Author: Kamal Chandraprakash <kamal.chandraprakash@gmail.com> Reviewer: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Edoardo Comar	35c08189cb	MINOR: fixing streams test-util compilation errors in Eclipse (#4631 ) Author: Edoardo Comar <ecomar@uk.ibm.com> Reviewer: Matthias J. Sax <matthias@confluent.io>	7 years ago
Guozhang Wang	95ad03540f	KAFKA-6634: Delay starting new transaction in task.initializeTopology (#4684 ) As titled, not starting new transaction since during restoration producer would have no activity and hence may cause txn expiration. Also delay starting new txn in resuming until initializing topology. Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>	7 years ago
Detharon	724032bd06	MINOR: Fix incorrect JavaDoc (type mismatch) (#4632 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
Jacek Laskowski	c28c556e92	MINOR: Remove code duplication + excessive space (#4683 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
Vitaly Pushkar	b1aa1912f0	KAFKA-4831: Extract WindowedSerde to public APIs (#3307 ) Now that we have augmented WindowSerde with non-arg parameters, extract it out as part of the public APIs so that users who want to I/O windowed streams can use it. This is originally introduced by @vitaly-pushkar This PR grows out to be a much larger one, as I found a few tech debts and bugs while working on it. Here is a summary of the PR: Public API changes (I will propose a KIP after a first round of reviews): Add TimeWindowedSerializer, TimeWindowedDeserializer, SessionWindowedSerializer, SessionWindowedDeserializer into o.a.k.streams.kstream. The serializers would implement an internal WindowedSerializer interface for the serializeBaseKey function used in 3) below. Add WindowedSerdes into o.a.k.streams.kstream. The reason to not add them into o.a.k.clients's Serdes is that it then needs dependency of streams. Add "default.windowed.key.serde.inner" and "default.windowed.value.serde.inner" into StreamsConfig, used when "default.key.serde" is specified to use time or session windowed serde. Note this requires the serde class, not the type class. Consolidated serde format from multiple classes, including SessionKeySerde.java for session, and WindowStoreUtils for time window, into SessionKeySchema and WindowKeySchema. Bug fix: WindowedStreamPartitioner needs to consider both time window and session window serdes. Removed RocksDBWindowBytesStore etc optimization since after KIP-182 all the serde now happens on metered store, hence this optimization is not worth it. Bug fix: for time window, the serdes used for store and the serdes used for piping (source and sink node) are different: the former needs to append sequence number but not the latter. Other minor cleanups: remove unnecessary throws, etc. Authors: Guozhang Wang <wangguoz@gmail.com>, Vitaly Pushkar <vitaly.pushkar@gmail.com> Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>, Xi Hu	7 years ago
John Roesler	6a383d8bc4	MINOR: clean stateDirectory in TopologyTestDriver (#4655 ) Author: John Roesler <john@confluent.io> Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Damian Guy	989088f697	KAFKA-6560: [FOLLOW-UP] don't deserialize null byte array in window store fetch (#4665 ) If the result of a fetch from a Window Store results in a null byte array we should return null rather than passing it to the serde to deserialize. Reviewers: Guozhang Wang <wangguoz@gmail.com>	7 years ago
nafshartous	cf092aeecc	KAFKA-5660 Don't throw TopologyBuilderException during runtime (#4645 ) Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Bill Bejeck	a1b01f48e9	KAFKA-6309: Improve task assignor load balance (#4624 ) Sorts TaskIds on first assignment evenly distributing tasks by topicGroupId should help with evening the load of work across topologies. This PR is an initial "strawman" approach which will be followed up (at a later date YTBD) by scoring or assigning weight to processing nodes to ensure even processing distribution. Added a new test to existing unit test.	7 years ago
Bill Bejeck	8a7d7e7955	MINOR: Add System test for standby task-rebalancing (#4554 ) Author: Bill Bejeck <bill@confluent.io> Reviewers: Damian Guy <damian@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Matthias J. Sax	2a4ba75e13	KAFKA-6054: Code cleanup to prepare the actual fix for an upgrade path (#4630 ) Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Guozhang Wang	eb449fe7c5	KAFKA-6560: Replace range query with newly added single point query in Windowed Aggregation (#4578 ) * Add a new fetch(K key, long window-start-timestamp) API into ReadOnlyWindowStore. * Use the new API to replace the range fetch API in KStreamWindowedAggregate and KStreamWindowedReduce. * Added corresponding unit tests. * Also removed some redundant byte serdes in byte stores.	7 years ago
Guozhang Wang	97ad549d56	KAFKA-6534: Enforce a rebalance in the next poll call when encounter task migration (#4544 ) The fix is twofold: For tasks that are closed in closeZombieTask, their corresponding partitions are still in runningByPartition so those closed tasks may still be returned in activeTasks and standbyTasks. Adding guards on the returned tasks and if they are closed notify the thread to trigger rebalance immediately. When triggering a rebalance, un-subscribe and re-subscribe immediately to make sure we are not dependent on the background heartbeat thread timing. Some minor changes on log4j. More specifically, I moved the log entry of closeZombieTask to its callers with more context information and the action going to take. I can re-produce the issue with EosIntegrationTest may hand-code the heartbeat thread to GC, and confirmed this patch fixed the issue. Unfortunately this test cannot be added to AK since currently we do not have ways to manipulate the heartbeat thread in unit tests. Reviewers: Jason Gustafson <jason@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Guozhang Wang	f26fbb9adc	MINOR: Rename stream partition assignor to streams partition assignor (#4621 ) This is a straight-forward change that make the name of the partition assignor to be aligned with Streams. Reviewers: Matthias J. Sax <mjsax@apache.org>	7 years ago
Blake Miller	7c5d0c459f	MINOR:Fix typo in the impl source (#4587 ) The static method KStreamImpl.createReparitionedSource() is missing a t. This PR globally fixes the typo and keeps the code indentation consistent.	7 years ago
Matthias J. Sax	5df535e8a3	MINOR: fixes lgtm.com warnings (#4582 ) fixes lgtm.com warnings cleanup PrintForeachAction and Printed Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Sebastian Bauersfeld <sebastianbauersfeld@gmx.de>, Damian Guy <damian@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago


1017 Commits (e38e3a66ab099996ecb156ec9105869f3d9b9228)