src-kafka

Commit Graph

Author	SHA1	Message	Date
Ismael Juma	3930dd7e75	KAFKA-4554; Fix ReplicaBuffer.verifyChecksum to use iterators instead of iterables This was changed in `b58b6a1bef` and caused the `ReplicaVerificationToolTest.test_replica_lags` system test to start failing. I also added a unit test and a couple of other minor clean-ups. Author: Ismael Juma <ismael@juma.me.uk> Reviewers: Jason Gustafson <jason@confluent.io> Closes #2280 from ijuma/kafka-4554-fix-replica-buffer-verify-checksum	8 years ago
Guozhang Wang	7f4b278c03	MINOR: Add more exception information in ProcessorStateManager Author: Guozhang Wang <wangguoz@gmail.com> Reviewers: Damian Guy, Jun Rao Closes #2276 from guozhangwang/KMinor-exception-message	8 years ago
Damian Guy	56c61745de	KAFKA-4540: Suspended tasks that are not assigned to the StreamThread need to be closed before new active and standby tasks are created During `onPartitionsAssigned` first close, and remove, any suspended `StandbyTasks` that are no longer assigned to this consumer. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Guozhang Wang <wangguoz@gmail.com> Closes #2266 from dguy/kafka-4540	8 years ago
Damian Guy	0321bf5aa6	KAFKA-4473: RecordCollector should handle retriable exceptions more strictly The `RecordCollectorImpl` currently drops messages on the floor if an exception is non-null in the producer callback. This will result in message loss and violates at-least-once processing. Rather than just log an error in the callback, save the exception in a field. On subsequent calls to `send`, `flush`, `close`, first check for the existence of an exception and throw a `StreamsException` if it is non-null. Also, in the callback, if an exception has already occurred, the `offsets` map should not be updated. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Guozhang Wang <wangguoz@gmail.com> Closes #2249 from dguy/kafka-4473	8 years ago
Damian Guy	65acff32d1	KAFKA-4534: StreamPartitionAssignor only ever updates the partitionsByHostState and metadataWithInternalTopics on first assignment partitionsByHostState and metadataWithInternalTopics need to be updated on each call to onAssignment() otherwise they contain invalid/stale metadata. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Matthias J. Sax, Guozhang Wang Closes #2256 from dguy/4534	8 years ago
Damian Guy	056ed86600	KAFKA-4539: StreamThread is not correctly creating StandbyTasks Tasks that don't have any `StateStore`s wont have a `StandbyTask`, so `createStandbyTask` can return `null`. We need to check for this in `StandbyTaskCreator.createTask(...)` Also, the checkpointed offsets for `StandbyTask`s are never loaded. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Eno Thereska, Matthias J. Sax, Guozhang Wang Closes #2255 from dguy/kafka-4539	8 years ago
Damian Guy	233cd4b18a	KAFKA-4537: StreamPartitionAssignor incorrectly adds standby partitions to the partitionsByHostState map If a KafkaStreams app is using Standby Tasks then `StreamPartitionAssignor` will add the standby partitions to the partitionsByHostState map for each host. This is incorrect as the partitionHostState map is used to resolve which host is hosting a particular store for a key. The result is that doing metadata lookups for interactive queries can return an incorrect host Author: Damian Guy <damian.guy@gmail.com> Reviewers: Eno Thereska, Matthias J. Sax, Guozhang Wang Closes #2254 from dguy/KAFKA-4537	8 years ago
Matthias J. Sax	c2cfadf254	MINOR: update KStream JavaDocs Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Guozhang Wang <wangguoz@gmail.com> Closes #2258 from mjsax/minorKStreamJavaDoc	8 years ago
Eno Thereska	ea724497a8	HOTFIX: Streams fix state transition stuck on rebalance This fixes a problem where the Kafka instance state transition gets stuck on rebalance (Thanks to dguy for pointing out). Also adjusts the test in QueryableStateIntegration test. Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Damian Guy, Matthias J. Sax, Guozhang Wang Closes #2252 from enothereska/hotfix_state_never_running	8 years ago
Damian Guy	448f194c70	KAFKA-4532: StateStores can be connected to the wrong source topic resulting in incorrect metadata returned from Interactive Queries When building a topology with tables and StateStores, the StateStores are mapped to the source topic names. This map is retrieved via TopologyBuilder.stateStoreNameToSourceTopics() and is used in Interactive Queries to find the source topics and partitions when resolving the partitions that particular keys will be in. There is an issue where by this mapping for a table that is originally created with builder.table("topic", "table");, and then is subsequently used in a join, is changed to the internal repartition topic. This is because the mapping is updated during the call to topology.connectProcessorAndStateStores(..). In the case that the stateStoreNameToSourceTopics Map already has a value for the state store name it should not update the Map. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Matthias J. Sax, Guozhang Wang Closes #2250 from dguy/kafka-4532	8 years ago
Matthias J. Sax	8591137869	KAFKA-4509: Task reusage on rebalance fails for threads on same host Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Damian Guy, Guozhang Wang Closes #2233 from mjsax/kafka-4509-task-reusage-fix	8 years ago
Damian Guy	21d7e6f19b	KAFKA-4516: When a CachingStateStore is closed it should clear its associated NamedCache Clear and remove the NamedCache from the ThreadCache when a CachingKeyValueStore or CachingWindowStore is closed. Validate that the store is open when doing any queries or writes to Caching State Stores. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Eno Thereska, Guozhang Wang Closes #2235 from dguy/kafka-4516	8 years ago
Matthias J. Sax	6f7ed15dad	KAFKA-4510: StreamThread must finish rebalance in state PENDING_SHUTDOWN Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Eno Thereska, Guozhang Wang Closes #2227 from mjsax/kafka-4510-finish-rebalance-on-shutdown	8 years ago
Matthias J. Sax	1d586cb50a	KAFKA-4476: Kafka Streams gets stuck if metadata is missing - break loop in StreamPartitionAssigner.assign() in case partition metadata is missing - fit state transition issue (follow up to KAFKA-3637: Add method that checks if streams are initialised) - some test improvements Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Eno Thereska, Ismael Juma, Guozhang Wang Closes #2209 from mjsax/kafka-4476-stuck-on-missing-metadata	8 years ago
Eno Thereska	7c7becd4cb	KAFKA-4486: Don't commit offsets on exception Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Matthias J. Sax, Damian Guy, Guozhang Wang Closes #2225 from enothereska/KAFKA-4486-exception-commit	8 years ago
Matthias J. Sax	9bed8fbcfc	KAFKA-4393: Improve invalid/negative TS handling Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Michael G. Noll, Eno Thereska, Damian Guy, Guozhang Wang Closes #2117 from mjsax/kafka-4393-improveInvalidTsHandling	8 years ago
Matthias J. Sax	1949a76bc4	MINOR: Update JavaDoc of KStream interface Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Damian Guy, Eno Thereska, Guozhang Wang Closes #2153 from mjsax/javaDocKStreams	8 years ago
Guozhang Wang	600859e77c	KAFKA-4392; Handle NoSuchFileException gracefully in StateDirectory Author: Guozhang Wang <wangguoz@gmail.com> Reviewers: Damian Guy <damian.guy@gmail.com>, Ismael Juma <ismael@juma.me.uk> Closes #2121 from guozhangwang/K4392-race-dir-cleanup	8 years ago
Damian Guy	e9a67a8daa	KAFKA-4492: Make the streams cache eviction policy tolerable to reentrant puts The NamedCache wasn't correctly dealing with its re-entrant nature. This would result in the LRU becoming corrupted, and the above exception occurring during eviction. For example: Cache A: dirty key 1 eviction runs on Cache A Node for key 1 gets marked as clean Entry for key 1 gets flushed downstream Downstream there is a processor that also refers to the table fronted by Cache A Downstream processor puts key 2 into Cache A This triggers an eviction of key 1 again ( it is still the oldest node as hasn't been removed from the LRU) As the Node for key 1 is clean flush doesn't run and it is immediately removed from the cache. So now we have dirtyKey set with key =1, but the value doesn't exist in the cache. Downstream processor tries to put key = 1 into the cache, it fails as key =1 is in the dirtyKeySet. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Eno Thereska, Guozhang Wang Closes #2226 from dguy/cache-bug	8 years ago
Damian Guy	a4592a1864	KAFKA-4488: UnsupportedOperationException during initialization of StandbyTask Instead of throwing `UnsupportedOperationException` from `StandbyTask.recordCollector()` return a No-op implementation of `RecordCollector`. Refactored `RecordCollector` to have an interface and impl. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Eno Thereska, Guozhang Wang Closes #2212 from dguy/standby-task	8 years ago
Eno Thereska	8fd5b6a666	HOTFIX: Temporary suspension of 2 tests Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Ismael Juma <ismael@juma.me.uk> Closes #2199 from enothereska/hotfix-streams-test-reset-ignore	8 years ago
Eno Thereska	93804d50ff	2198 Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Guozhang Wang <wangguoz@gmail.com> Closes #2198 from enothereska/hotfix-stream-states	8 years ago
Ismael Juma	128d0ff91d	KAFKA-2247; Merge kafka.utils.Time and kafka.common.utils.Time Also: * Make all implementations of `Time` thread-safe as they are accessed from multiple threads in some cases. * Change default implementation of `MockTime` to use two separate variables for `nanoTime` and `currentTimeMillis` as they have different `origins`. Author: Ismael Juma <ismael@juma.me.uk> Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, Shikhar Bhushan <shikhar@confluent.io>, Jason Gustafson <jason@confluent.io>, Eno Thereska <eno.thereska@gmail.com>, Damian Guy <damian.guy@gmail.com> Closes #2095 from ijuma/kafka-2247-consolidate-time-interfaces	8 years ago
Eno Thereska	ea42d65354	KAFKA-3637: Added initial states Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Ismael Juma, Dan Norwood, Xavier Léauté, Damian Guy, Michael G. Noll, Matthias J. Sax, Guozhang Wang Closes #2135 from enothereska/KAFKA-3637-streams-state	8 years ago
Eno Thereska	7d0f3e75ad	KAFKA-4427: Skip topic groups with no tasks Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com> Closes #2171 from enothereska/KAFKA-4427-topicgroups-with-no-tasks	8 years ago
Matthias J. Sax	724cddbc56	KAFKA-4331: Kafka Streams resetter is slow because it joins the same group for each topic - bug-fix follow up - Resetter fails if no intermediate topic is used because seekToEnd() commit ALL partitions to EOL Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Michael G. Noll, Roger Hoover, Guozhang Wang Closes #2138 from mjsax/kafka-4331-streams-resetter-bugfix	8 years ago
Eno Thereska	dcea5f8388	KAFKA-4355: Skip topics that have no partitions Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Damian Guy, Matthias J. Sax, Guozhang Wang Closes #2133 from enothereska/KAFKA-4355-topic-not-found	8 years ago
Jeyhun Karimov	eaf0e4af34	KAFKA-3825: Allow users to specify different types of state stores in Streams DSL Author: Jeyhun Karimov <je.karimov@gmail.com> Reviewers: Damian Guy, Guozhang Wang Closes #1588 from jeyhunkarimov/KAFKA-3825	8 years ago
Damian Guy	2daa10d77f	KAFKA-4366: KafkaStreams.close() blocks indefinitely Added `timeout` and `timeUnit` to `KafkaStreams.close(..)`. Now do close on a thread and `join` that thread with the provided `timeout`. Changed `state` in `KafkaStreams` to use an enum. Added system test to ensure we don't deadlock on close when an uncaught exception handler that calls `System.exit(..)` is used and there is also a shutdown hook that calls `KafkaStreams.close(...)` Author: Damian Guy <damian.guy@gmail.com> Reviewers: Matthias J. Sax, Eno Thereska, Guozhang Wang Closes #2097 from dguy/kafka-4366	8 years ago
Eno Thereska	04a13e82a6	KAFKA-4359: Remove commit interval in integration tests for testing caching effects Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com> Closes #2124 from enothereska/KAFKA-4359-intergration-tests-commit1	8 years ago
Damian Guy	64a860c585	KAFKA-4379: Remove caching of dirty and removed keys from StoreChangeLogger The `StoreChangeLogger` currently keeps a cache of dirty and removed keys and will batch the changelog records such that we don't send a record for each update. However, with KIP-63 this is unnecessary as the batching and de-duping is done by the caching layer. Further, the `StoreChangeLogger` relies on `context.timestamp()` which is likely to be incorrect when caching is enabled Author: Damian Guy <damian.guy@gmail.com> Reviewers: Matthias J. Sax, Eno Thereska, Guozhang Wang Closes #2103 from dguy/store-change-logger	8 years ago
Dan Norwood	0fc1898bf6	HOTFIX: failing to close this iterator causes leaks in rocksdb guozhangwang dguy Author: dan norwood <norwood@confluent.io> Reviewers: Damian Guy, Michael G. Noll, Guozhang Wang Closes #2122 from norwood/close-call	8 years ago
Matthias J. Sax	d8fa4006cb	MINOR: improve exception message for incompatible Serdes to actual key/value data types Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Michael G. Noll, Guozhang Wang Closes #2118 from mjsax/hotfixImproveSerdeTypeMissmatchError	8 years ago
Guozhang Wang	cc62b4f844	revert streams/src/main/java/org/apache/kafka/streams/processor/ConsumerRecordTimestampExtractor.java	8 years ago
Matthias J. Sax	6972d9476f	MINOR: add upgrade guide for Kafka Streams API Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Michael G. Noll, Eno Thereska Closes #2114 from mjsax/updateDocUpgradeSection	8 years ago
Damian Guy	a66ca1624c	MINOR: fix incorrect logging in StreamThread Fix incorrect logging when unable to create an active task. The output was: Failed to create an active task %s: It should have the taskId. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Ismael Juma, Eno Thereska Closes #2109 from dguy/minor-logging	8 years ago
Damian Guy	f1e1b6e744	MINOR: remove unused fields from KTableImpl Remove `keySerde`, `valSerde`, `OUTERTHIS_NAME`, `OUTEROTHER_NAME`, `LEFTTHIS_NAME`, `LEFTOTHER_NAME` from `KTableImpl` as they are all unused fields Author: Damian Guy <damian.guy@gmail.com> Reviewers: Guozhang Wang <wangguoz@gmail.com> Closes #2119 from dguy/minor-ktable-unused	8 years ago
Damian Guy	7c36fc3776	KAFKA-4311: Multi layer cache eviction causes forwarding to incorrect ProcessorNode Given a topology like the one below. If a record arriving in `tableOne` causes a cache eviction, it will trigger the `leftJoin` that will do a `get` from `reducer-store`. If the key is not currently cached in `reducer-store`, but is in the backing store, it will be put into the cache, and it may also trigger an eviction. If it does trigger an eviction and the eldest entry is dirty it will flush the dirty keys. It is at this point that a ClassCastException is thrown. This occurs because the ProcessorContext is still set to the context of the `leftJoin` and the next child in the topology is `mapValues`. We need to set the correct `ProcessorNode`, on the context, in the `ForwardingCacheFlushListener` prior to calling `context.forward`. We also need to remember to reset the `ProcessorNode` to the previous node once `context.forward` has completed. ``` final KTable<String, String> one = builder.table(Serdes.String(), Serdes.String(), tableOne, tableOne); final KTable<Long, String> two = builder.table(Serdes.Long(), Serdes.String(), tableTwo, tableTwo); final KTable<String, Long> reduce = two.groupBy(new KeyValueMapper<Long, String, KeyValue<String, Long>>() { Override public KeyValue<String, Long> apply(final Long key, final String value) { return new KeyValue<>(value, key); } }, Serdes.String(), Serdes.Long()) .reduce(new Reducer<Long>() {..}, new Reducer<Long>() {..}, "reducer-store"); one.leftJoin(reduce, new ValueJoiner<String, Long, String>() {..}) .mapValues(new ValueMapper<String, String>() {..}); ``` Author: Damian Guy <damian.guy@gmail.com> Reviewers: Eno Thereska, Guozhang Wang Closes #2051 from dguy/kafka-4311	8 years ago
Xavier Léauté	8ab8f37b43	MINOR: fix typos and incorrect docs Author: Xavier Léauté <xavier@confluent.io> Reviewers: Matthias J. Sax, Guozhang Wang Closes #2112 from xvrl/minor-doc-fixes	8 years ago
Dan Norwood	11766ad318	MINOR: some trace logging for streams debugging Author: Ubuntu <norwood@confluent.io> Reviewers: Eno Thereska <eno.thereska@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io> Closes #1882 from norwood/streams-logging	8 years ago
Damian Guy	92287849e4	MINOR: remove commented out code and System.out.println Remove commented out code and System.out.println from KTableKTableJoinIntegrationTest Author: Damian Guy <damian.guy@gmail.com> Reviewers: Matthias J. Sax, Eno Thereska, Guozhang Wang Closes #2092 from dguy/cleanup-comments	8 years ago
Matthias J. Sax	fbbe5821c2	KAFKA-4352: instable ResetTool integration test - increased timeout to stabilize test Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Eno Thereska, Guozhang Wang Closes #2082 from mjsax/kafka-4352-hotfix	8 years ago
Damian Guy	ba322e5420	KAFKA-4361: Streams does not respect user configs for "default" params Enable user provided consumer and producer configs to override the streams default configs. Author: Damian Guy <damian.guy@gmail.com> Reviewers: Eno Thereska, Matthias J. Sax, Guozhang Wang Closes #2084 from dguy/kafka-4361	8 years ago
Matthias J. Sax	76b0702841	HOTFIX: improve error message on invalid input record timestamp Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Guozhang Wang, Ismael Juma, Michael G. Noll, Eno Thereska Closes #2076 from mjsax/hotfixTSExtractor	8 years ago
Matthias J. Sax	18223415b0	KAFKA-4302: Simplify KTableSource KTableSource is always materialized since IQ: - removed flag KTableSource#materialized - removed MaterializedKTableSourceProcessor Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Eno Thereska, Guozhang Wang Closes #2065 from mjsax/kafka-4302-simplify-ktablesource	8 years ago
Eno Thereska	29ea4b0f14	KAFKA-3559: Recycle old tasks when possible Author: Eno Thereska <eno.thereska@gmail.com> Reviewers: Damian Guy, Guozhang Wang Closes #2032 from enothereska/KAFKA-3559-onPartitionAssigned	8 years ago
Guozhang Wang	a4ab9d02a2	KAFKA-4117: Stream partitionassignro cleanup 1. Create a new `ClientMetadata` to collapse `Set<String> consumerMemberIds`, `ClientState<TaskId> state`, and `HostInfo hostInfo`. 2. Stop reusing `stateChangelogTopicToTaskIds` and `internalSourceTopicToTaskIds` to access the (sub-)topology's internal repartition and changelog topics for clarity; also use the source topics num.partitions to set the num.partitions for repartition topics, and clarify to NOT have cycles since otherwise the while loop will fail. 3. `ensure-copartition` at the end to modify the number of partitions for repartition topics if necessary to be equal to other co-partition topics. 4. Refactor `ClientState` as well and update the logic of `TaskAssignor` for clarity as well. 5. Change default `clientId` from `applicationId-suffix` to `applicationId-processId` where `processId` is an UUID to avoid conflicts of clientIds that are from different JVMs, and hence conflicts in metrics. 6. Enforce `assignment` partitions to have the same size, and hence 1-1 mapping to `activeTask` taskIds. 7. Remove the `AssignmentSupplier` class by always construct the `partitionsByHostState` before assigning tasks to consumers within a client. 8. Remove all unnecessary member variables in `StreamPartitionAssignor`. 9. Some other minor fixes on unit tests, e.g. remove `test only` functions with java field reflection. Author: Guozhang Wang <wangguoz@gmail.com> Reviewers: Xavier Léauté, Matthias J. Sax, Eno Thereska, Jason Gustafson Closes #2012 from guozhangwang/K4117-stream-partitionassignro-cleanup	8 years ago
Matthias J. Sax	dab3617e1a	MINOR: improve JavaDoc for Streams window retention time Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Guozhang Wang <wangguoz@gmail.com> Closes #2068 from mjsax/hotfixImproveWindowRetentionTimeJavaDoc	8 years ago
Ismael Juma	d092673838	MINOR: A bunch of clean-ups related to usage of unused variables There should be only one cases where these clean-ups have a functional impact: replaced repeated identical logs with a single log for the stale controller epoch case. The rest should just make the code easier to read and make it a bit less wasteful. I did this exercise because unused variables sometimes mask bugs. Author: Ismael Juma <ismael@juma.me.uk> Reviewers: Jason Gustafson <jason@confluent.io> Closes #1985 from ijuma/remove-unused	8 years ago
Matthias J. Sax	63da487213	KAFAK-4058: Failure in org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterReset - fixed consumer group dead condition - disabled state store cache Author: Matthias J. Sax <matthias@confluent.io> Reviewers: Guozhang Wang <wangguoz@gmail.com> Closes #2056 from mjsax/KAFKA-4058-instableResetToolTest	8 years ago

... 3 4 5 6 7 ...

538 Commits (46aa88b9cf82971a0890f4f83efa9639f84c677b)