src-kafka

Commit Graph

Author	SHA1	Message	Date
Guozhang Wang	c8c3a7dc48	KAFKA-7192 Follow-up: update checkpoint to the reset beginning offset (#5430 ) 1. When we reinitialize the state store due to no CHECKPOINT with EOS turned on, we should update the checkpoint to consumer.seekToBeginning() / consumer.position() to avoid falling into endless iterations. 2. Fixed a few other logic bugs around needsInitializing and needsRestoring. Reviewers: Jason Gustafson <jason@confluent.io>, Bill Bejeck <bbejeck@gmail.com>	6 years ago
Guozhang Wang	061885e9f1	KAFKA-7192: Wipe out if EOS is turned on and checkpoint file does not exist (#5421 ) 1. As titled and as described in comments. 2. Modified unit test slightly to insert for new keys in committed data to expose this issue. Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Matthias J. Sax	42af41d5fc	MINOR: Caching layer should forward record timestamp (#5423 ) Reviewer: Guozhang Wang <guozhang@confluent.io>	6 years ago
Bill Bejeck	1d9a427225	KAFKA-7144: Fix task assignment to be even (#5390 ) This PR now just removes the check in TaskPairs.hasNewPair that was causing the task assignment issue. This was done as we need to further refine task assignment strategy and this approach needs to include the statefulness of tasks and is best done in one pass vs taking a "patchy" approach. Updated current tests and ran locally. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Matthias J. Sax	487b954542	MINOR: internal config objects should not be logged (#5389 ) Reviewers: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	6 years ago
Rajini Sivaram	4b60ed3247	KAFKA-7193: Use ZooKeeper IP address in streams tests to avoid timeouts (#5414 ) ZooKeeper client from version 3.4.13 doesn't handle connections to localhost very well. If ZooKeeper is started on 127.0.0.1 on a machine that has both ipv4 and ipv6 and a client is created using localhost rather than the IP address in the connection string, ZooKeeper client attempts to connect to ipv4 or ipv6 randomly with a fixed one second backoff if connection fails. Use 127.0.0.1 instead of localhost in streams tests to avoid intermittent test failures due to ZK client connection timeouts if ipv6 is chosen in consecutive address selections. Also add note to upgrade docs for 2.0.0. Reviewers: Ismael Juma <github@juma.me.uk>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Guozhang Wang	75825caee4	KAFKA-5037 Follow-up: move Scala test to Java (#5399 ) Reviewers: Ted Yu <yuzhihong@gmail.com>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Manikumar Reddy O	9089fb2d82	MINOR: Fix format violations streams scala tests (#5402 ) @guozhangwang @mjsax hot fix for streams scala test format violations Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
Ted Yu	82f124ae30	KAFKA-5037: Fix infinite loop if all input topics are unknown at startup 1. At the beginning of assign, we first check that all the non-repartition source topics are included in the metadata. If not, we log an error at the leader and set an error in the Assignment userData bytes, indicating that leader cannot complete assignment and the error code would indicate the root cause of it. 2. Upon receiving the assignment, if the error is not NONE the streams will shutdown itself with a log entry re-stating the root cause interpreted from the error code. Author: tedyu <yuzhihong@gmail.com> Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com> Closes #5322 from tedyu/trunk	6 years ago
Manikumar Reddy O	96c53e96b8	MINOR: Remove deprecated ZkUtils usage from EmbeddedKafkaCluster (#5324 ) Reviewers: Matthias J. Sax <mjsax@apache.org>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Guozhang Wang	2f6240ac94	KAFKA-3514: Remove min timestamp tracker (#5382 ) 1. Remove MinTimestampTracker and its TimestampTracker interface. 2. In RecordQueue, keep track of the head record (deserialized) while putting the rest of the records as raw bytes in the FIFO queue; the head record as well as the partition timestamp will be updated accordingly. Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Matthias J. Sax	06d96628f0	MINOR: remove unused MeteredKeyValueStore (#5380 ) Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	6 years ago
Liquan Pei	08fe24b46a	KAFKA-7103: Use bulkloading for RocksDBSegmentedBytesStore during init (#5276 ) This PR uses bulk loading for recovering RocksDBWindowStore, same as RocksDBStore. Reviewers: Boyang Chen <bchen11@outlook.com>, Shawn Nguyen <shnguyen@pinterest.com>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
hashangayasri	07647c2a4c	MINOR: make the constructor of InMemoryKeyValueStore public so that it can be re-used by custom (in-memory) stores (#5310 ) Reviewers: Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Joan Goyeau	05c5854d1f	MINOR: Add Scalafmt to Streams Scala API (#4965 ) Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
John Roesler	e38e3a66ab	MINOR: Fix standby streamTime (#5288 ) #5253 broke standby restoration for windowed stores. Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Guozhang Wang	8250738ae4	KAFKA-7101: Consider session store for windowed store default configs (#5298 ) 1. extend isWindowStore to consider session store as well. 2. extend the existing unit test accordingly. Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
John Roesler	64fff8bfcc	KAFKA-7080: replace numSegments with segmentInterval (#5257 ) See also KIP-319. Replace number-of-segments parameters with segment-interval-ms parameters in various places. The latter was always the parameter that several components needed, and we accidentally supplied the former because it was the one available. Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Chia-Ping Tsai	57320981bb	Minor: fix javadocs of StreamsConfig and ValueTransformerWithKey (#5157 ) Reviewer: Matthias J. Sax <matthias@confluent.io>	6 years ago
Yishun Guan	d44d5d7520	KAFKA-6986: Export Admin Client metrics through Stream Threads (#5210 ) We already exported producer and consumer metrics through KafkaStreams class: #4998 It makes sense to also export the Admin client metrics. I didn't add a separate unit test case for this. Let me know if it's needed. This is my first contribution, feel free to point out any mistakes that I made. Reviewers: Guozhang Wang <wangguoz@gmail.com>	6 years ago
Guozhang Wang	7947c94140	MINOR: Upgrade RocksDB to 5.13.4 (#5309 ) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Guozhang Wang	6bfaf4dc60	MINOR: Store metrics scope, total metrics (#5290 ) 1. Rename metrics scope of rocksDB window and session stores; also modify the store metrics accordingly with guidance on its correlations to metricsScope. 2. Add the missing total metrics for per-thread, per-task, per-node and per-store sensors. Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Guozhang Wang	be0f10e190	MINOR: KAFKA-7112: Only resume restoration if state is still PARTITIONS_ASSIGNED after poll (#5306 ) Before KIP-266, consumer.poll(0) would call updateAssignmentMetadataIfNeeded(Long.MAX_VALUE), which makes sure that the rebalance is definitely completed, i.e. both onPartitionRevoked and onPartitionAssigned called within this poll(0). After KIP-266, however, it is possible that only onPartitionRevoked will be called if timeout is elapsed. And hence we need to double check that state is still PARTITIONS_ASSIGNED after the consumer.poll(duration) call. Reviewers: Ted Yu <yuzhihong@gmail.com>, Matthias J. Sax <matthias@confluent.io>	6 years ago
Manikumar Reddy O	51935ee2e6	KAFKA-7091; AdminClient should handle FindCoordinatorResponse errors (#5278 ) - Update KafkaAdminClient implementation to handle FindCoordinatorResponse errors - Remove scala AdminClient usage from core and streams tests Reviewers: Matthias J. Sax <matthias@confluent.io>, Jason Gustafson <jason@confluent.io>	6 years ago
Ismael Juma	7a74ec62d2	MINOR: Avoid FileInputStream/FileOutputStream (#5281 ) They rely on finalizers (before Java 11), which create unnecessary GC load. The alternatives are as easy to use and don't have this issue. Also use FileChannel directly instead of retrieving it from RandomAccessFile whenever possible since the indirection is unnecessary. Finally, add a few try/finally blocks. Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
xinzhg	b054789d69	MINOR: Fix comment in quick union (#5244 ) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
John Roesler	954be11bf2	KAFKA-6978: make window retention time strict (#5218 ) Enforce window retention times strictly: * records for windows that are expired get dropped * queries for timestamps old enough to be expired immediately answered with null Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Guozhang Wang	d3e264e773	MINOR: update web docs and examples of Streams with Java8 syntax (#5249 ) Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Damian Guy <damian@confluent.io>	7 years ago
John Roesler	6732593bba	KAFKA-7072: clean up segments only after they expire (#5253 ) Significant refactor of Segments to use stream-time as the basis of segment expiration. Previously Segments assumed that the current record time was representative of stream time. In the event of a "future" event (one whose record time is greater than the stream time), this would inappropriately drop live segments. Now, Segments will provision the new segment to house the future event and drop old segments only after they expire. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Stephane Maarek	410e00cbcb	KAFKA-7066 added better logging in case of Serialisation issue (#5239 ) Following the error message of: https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/SinkNode.java#L93 Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Ismael Juma	cc4dce94af	KAFKA-2983: Remove Scala consumers and related code (#5230 ) - Removed Scala consumers (`SimpleConsumer` and `ZooKeeperConsumerConnector`) and their tests. - Removed Scala request/response/message classes. - Removed any mention of new consumer or new producer in the code with the exception of MirrorMaker where the new.consumer option was never deprecated so we have to keep it for now. The non-code documentation has not been updated either, that will be done separately. - Removed a number of tools that only made sense in the context of the Scala consumers (see upgrade notes). - Updated some tools that worked with both Scala and Java consumers so that they only support the latter (see upgrade notes). - Removed `BaseConsumer` and related classes apart from `BaseRecord` which is used in `MirrorMakerMessageHandler`. The latter is a pluggable interface so effectively public API. - Removed `ZkUtils` methods that were only used by the old consumers. - Removed `ZkUtils.registerBroker` and `ZKCheckedEphemeral` since the broker now uses the methods in `KafkaZkClient` and no-one else should be using that method. - Updated system tests so that they don't use the Scala consumers except for multi-version tests. - Updated LogDirFailureTest so that the consumer offsets topic would continue to be available after all the failures. This was necessary for it to work with the Java consumer. - Some multi-version system tests had not been updated to include recently released Kafka versions, fixed it. - Updated findBugs and checkstyle configs not to refer to deleted classes and packages. Reviewers: Dong Lin <lindong28@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>	7 years ago
Bill Bejeck	1354371d4f	KAFKA-6761: Construct logical Streams Graph in DSL Parsing (#4983 ) This version is a WIP and intentionally leaves out some additional required changes to keep the reviewing effort more manageable. This version of the process includes 1. Cleaning up the graph objects to reduce the number of parameters and make the naming conventions more clear. 2. Intercepting all calls to the InternalTopologyBuilder and capturing all details required for possible optimizations and building the final topology. This PR does not include writing out the current physical plan, so no tests included. The next PR will include additional changes to building the graph and writing the topology out without optimizations, using the current streams tests. Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
nixsticks	339fc2379d	KAFKA-7055: Update InternalTopologyBuilder to throw TopologyException if a processor or sink is added with no upstream node attached (#5215 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Matthias J. Sax	0dfa53c47a	KAFKA-6711: GlobalStateManagerImpl should not write offsets of in-memory stores in checkpoint file (#5219 )	7 years ago
Matthias J. Sax	ff96d57437	KAFKA-6860: Fix NPE in Kafka Streams with EOS enabled (#5187 ) Reviewers: John Roesler <john@confluent.io>, Ko Byoung Kwon, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
John Roesler	ce7fe8fe5f	MINOR: Use new consumer API timeout in test (#5217 ) The old timeout configs no longer take effect, as of 53ca52f855e903907378188d29224b3f9cefa6cb. They are replaced by the new one. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Jagadesh Adireddi	c903d5767e	KAFKA-6749: Fixed TopologyTestDriver to process stream processing guarantee as exactly once (#4912 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Ted Yu <yuzhihong@gmail.com>	7 years ago
Matthias J. Sax	301474f0ba	MINOR: code cleanup follow up for KAFKA-6906 (#5196 ) Reviewers: Ted Yu <yuzhihong@gmail.com>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Filipe Agapito	de4f4f530a	KAFKA-6474: Rewrite tests to use new public TopologyTestDriver [part 2] (#4986 ) * KAFKA-6474: Rewrite tests to use new public TopologyTestDriver [part 2] * Refactor: -KTableFilterTest.java -KTableImplTest.java -KTableMapValuesTest.java -KTableSourceTest.java * Add access to task, processorTopology, and globalTopology in TopologyTestDriver via TopologyTestDriverWrapper * Remove unnecessary constructor in TopologyTestDriver * Change how TopologyTestDriverWrapper#getProcessorContext sets the current node Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Gitomain	40f63eb9c1	KAFKA-6782: solved the bug of restoration of aborted messages for GlobalStateStore and KGlobalTable (#4900 ) Reviewer: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Guozhang Wang	7a59061252	KAFKA-7023: Add unit test (#5197 ) Add a unit test that validates after restoreStart, the options are set with bulk loading configs; and after restoreEnd, it resumes to the customized configs Reviewers: Matthias J. Sax <matthias@confluent.io>	7 years ago
Guozhang Wang	d98ec33364	KAFKA-7021: Reuse source based on config (#5163 ) This PR contains the following changes: 1. Leverage the TOPOLOGY_OPTIMIZATION config to "adjust" the topology internally to reuse the source topic. 2. Fixed a long-standing bug: whenever the source topic is reused as the changelog topic, write the checkpoint file for the consumed offset. This is done by taking the union of the ackedOffset from the producer and the consumed offset from the consumer; note that we prioritize ackedOffset since the same topic may show up in both (think about a repartition topic). By doing this, the consumed offset from source topics can be treated as the checkpointed offset when reuse happens. 3. Added a few unit and integration tests with and without the reuse, making sure the restoration, standby task, and internal topic creation behaviors are all correct. Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Jagadesh Adireddi	ee5cc974d2	KAFKA-6906: Fixed to commit transactions if data is produced via wall clock punctuation (#5105 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Liquan Pei	cc4157d877	KAFKA-7023: Move prepareForBulkLoad() call after customized RocksDBConfigSetter (#5166 ) Summary: options.prepareForBulkLoad() was called before applying the configs from the customized RocksDBConfigSetter, which may overwrite the configs set in the prepareForBulkLoad() call. The fix is to move the prepareForBulkLoad() call after applying the configs from the customized RocksDBConfigSetter. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
John Roesler	74bdafe386	KAFKA-5697: Use nonblocking poll in Streams (#5107 ) Make use of the new Consumer#poll(Duration) to avoid getting stuck in poll when the broker is unavailable. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Matthias J. Sax	bb260e924f	MINOR: remove duplicate map in StoreChangelogReader (#5143 ) Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Jagadesh Adireddi	150967994a	KAFKA-6538: Changes to enhance ByteStore exceptions thrown from RocksDBStore with more human readable info (#5103 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Bill Bejeck	f54acdbb13	KAFKA-6935: Add config for allowing optional optimization (#5071 ) Adding configuration to StreamsConfig allowing for making topology optimization optional. Added unit tests are verifying default values, setting correct value and failure on invalid values. Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Matthias J. Sax	0eddddb82b	KAFKA-6967: TopologyTestDriver does not allow pre-populating state stores that have change logging (#5096 ) Reviewers: Guozhang Wang <guozhang@confluent.io>, James Cheng <jylcheng@yahoo.com>, Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>	7 years ago
Lee Dongjin	594a0e1a07	KAFKA-6993: Fix defective documentations for KStream/KTable methods (#5136 ) * KAFKA-6993: Fix defective documentations for KStream/KTable methods 1. Fix the documentation of following methods, e.g., making more detailed description for the overloaded methods: - KStream#join - KStream#leftJoin - KStream#outerJoin - KTable#filter - KTable#filterNot - KTable#mapValues - KTable#transformValues - KTable#join - KTable#leftJoin - KTable#outerJoin 2. (trivial) with possible new type -> with possibly new type. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago

1132 Commits (5d7cb438a5607fd1bba35ee7a7cf1b2924bae45d)