src-kafka

Commit Graph

Author	SHA1	Message	Date
Ismael Juma	7a74ec62d2	MINOR: Avoid FileInputStream/FileOutputStream (#5281 ) They rely on finalizers (before Java 11), which create unnecessary GC load. The alternatives are as easy to use and don't have this issue. Also use FileChannel directly instead of retrieving it from RandomAccessFile whenever possible since the indirection is unnecessary. Finally, add a few try/finally blocks. Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Rajini Sivaram <rajinisivaram@googlemail.com>	6 years ago
xinzhg	b054789d69	MINOR: Fix comment in quick union (#5244 ) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	6 years ago
John Roesler	954be11bf2	KAFKA-6978: make window retention time strict (#5218 ) Enforce window retention times strictly: * records for windows that are expired get dropped * queries for timestamps old enough to be expired immediately answered with null Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	6 years ago
Guozhang Wang	d3e264e773	MINOR: update web docs and examples of Streams with Java8 syntax (#5249 ) Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Damian Guy <damian@confluent.io>	7 years ago
John Roesler	6732593bba	KAFKA-7072: clean up segments only after they expire (#5253 ) Significant refactor of Segments to use stream-time as the basis of segment expiration. Previously Segments assumed that the current record time was representative of stream time. In the event of a "future" event (one whose record time is greater than the stream time), this would inappropriately drop live segments. Now, Segments will provision the new segment to house the future event and drop old segments only after they expire. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Stephane Maarek	410e00cbcb	KAFKA-7066 added better logging in case of Serialisation issue (#5239 ) Following the error message of: https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/SinkNode.java#L93 Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Ismael Juma	cc4dce94af	KAFKA-2983: Remove Scala consumers and related code (#5230 ) - Removed Scala consumers (`SimpleConsumer` and `ZooKeeperConsumerConnector`) and their tests. - Removed Scala request/response/message classes. - Removed any mention of new consumer or new producer in the code with the exception of MirrorMaker where the new.consumer option was never deprecated so we have to keep it for now. The non-code documentation has not been updated either, that will be done separately. - Removed a number of tools that only made sense in the context of the Scala consumers (see upgrade notes). - Updated some tools that worked with both Scala and Java consumers so that they only support the latter (see upgrade notes). - Removed `BaseConsumer` and related classes apart from `BaseRecord` which is used in `MirrorMakerMessageHandler`. The latter is a pluggable interface so effectively public API. - Removed `ZkUtils` methods that were only used by the old consumers. - Removed `ZkUtils.registerBroker` and `ZKCheckedEphemeral` since the broker now uses the methods in `KafkaZkClient` and no-one else should be using that method. - Updated system tests so that they don't use the Scala consumers except for multi-version tests. - Updated LogDirFailureTest so that the consumer offsets topic would continue to be available after all the failures. This was necessary for it to work with the Java consumer. - Some multi-version system tests had not been updated to include recently released Kafka versions, fixed it. - Updated findBugs and checkstyle configs not to refer to deleted classes and packages. Reviewers: Dong Lin <lindong28@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>	7 years ago
Bill Bejeck	1354371d4f	KAFKA-6761: Construct logical Streams Graph in DSL Parsing (#4983 ) This version is a WIP and intentionally leaves out some additional required changes to keep the reviewing effort more manageable. This version of the process includes 1. Cleaning up the graph objects to reduce the number of parameters and make the naming conventions more clear. 2. Intercepting all calls to the InternalToplogyBuilder and capturing all details required for possible optimizations and building the final topology. This PR does not include writing out the current physical plan, so no tests included. The next PR will include additional changes to building the graph and writing the topology out without optimizations, using the current streams tests. Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
nixsticks	339fc2379d	KAFKA-7055: Update InternalTopologyBuilder to throw TopologyException if a processor or sink is added with no upstream node attached (#5215 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Matthias J. Sax	0dfa53c47a	KAFKA-6711: GlobalStateManagerImpl should not write offsets of in-memory stores in checkpoint file (#5219 )	7 years ago
Matthias J. Sax	ff96d57437	KAFKA-6860: Fix NPE in Kafka Streams with EOS enabled (#5187 ) Reviewers: John Roesler <john@confluent.io>, Ko Byoung Kwon, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
John Roesler	ce7fe8fe5f	MINOR: Use new consumer API timeout in test (#5217 ) The old timeout configs no longer take effect, as of 53ca52f855e903907378188d29224b3f9cefa6cb. They are replaced by the new one. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Jagadesh Adireddi	c903d5767e	KAFKA-6749: Fixed TopologyTestDriver to process stream processing guarantee as exactly once (#4912 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Ted Yu <yuzhihong@gmail.com>	7 years ago
Matthias J. Sax	301474f0ba	MINOR: code cleanup follow up for KAFKA-6906 (#5196 ) Reviewers: Ted Yu <yuzhihong@gmail.com>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Filipe Agapito	de4f4f530a	KAFKA-6474: Rewrite tests to use new public TopologyTestDriver [part 2] (#4986 ) * KAFKA-6474: Rewrite tests to use new public TopologyTestDriver [part 2] * Refactor: -KTableFilterTest.java -KTableImplTest.java -KTableMapValuesTest.java -KTableSourceTest.java * Add access to task, processorTopology, and globalTopology in TopologyTestDriver via TopologyTestDriverWrapper * Remove unnecessary constructor in TopologyTestDriver * Change how TopologyTestDriverWrapper#getProcessorContext sets the current node Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Gitomain	40f63eb9c1	KAFKA-6782: solved the bug of restoration of aborted messages for GlobalStateStore and KGlobalTable (#4900 ) Reviewer: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Guozhang Wang	7a59061252	KAFKA-7023: Add unit test (#5197 ) Add a unit test that validates after restoreStart, the options are set with bulk loading configs; and after restoreEnd, it resumes to the customized configs Reviewers: Matthias J. Sax <matthias@confluent.io>	7 years ago
Guozhang Wang	d98ec33364	KAFKA-7021: Reuse source based on config (#5163 ) This PR actually contains two changes: 1. leverage on the TOPOLOGY_OPTIMIZATION config to "adjust" the topology internally to reuse the source topic. 2. fixed a long dangling bug that whenever source topic is reused as changelog topic, write the checkpoint file for the consumed offset, this is done by union the ackedOffset from the producer, plus the consumed offset from the consumer, note we will priori ackedOffset since the same topic may show up in both (think about repartition topic), by doing this the consumed offset from source topics can be treated as checkpointed offset when reuse happens. 3. added a few unit and integration tests with / wo the reusing, and make sure the restoration, standby task, and internal topic creation behaviors are all correct. Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Jagadesh Adireddi	ee5cc974d2	KAFKA-6906: Fixed to commit transactions if data is produced via wall clock punctuation (#5105 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Liquan Pei	cc4157d877	KAFKA-7023: Move prepareForBulkLoad() call after customized RocksDBConfigSetter (#5166 ) *Summary options.prepareForBulkLoad() and then use the configs from the customized customized RocksDBConfigSetter. This may overwrite the configs set in prepareBulkLoad call. The fix is to move prepareBulkLoad call after applying configs customized RocksDBConfigSetter. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
John Roesler	74bdafe386	KAFKA-5697: Use nonblocking poll in Streams (#5107 ) Make use of the new Consumer#poll(Duration) to avoid getting stuck in poll when the broker is unavailable. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Matthias J. Sax	bb260e924f	MINOR: remove duplicate map in StoreChangelogReader (#5143 ) Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Jagadesh Adireddi	150967994a	KAFKA-6538: Changes to enhance ByteStore exceptions thrown from RocksDBStore with more human readable info (#5103 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Bill Bejeck	f54acdbb13	KAFKA-6935: Add config for allowing optional optimization (#5071 ) Adding configuration to StreamsConfig allowing for making topology optimization optional. Added unit tests are verifying default values, setting correct value and failure on invalid values. Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Matthias J. Sax	0eddddb82b	KAFKA-6967: TopologyTestDriver does not allow pre-populating state stores that have change logging (#5096 ) Reviewers: Guozhang Wang <guozhang@confluent.io>, James Cheng <jylcheng@yahoo.com>, Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>	7 years ago
Lee Dongjin	594a0e1a07	KAFKA-6993: Fix defective documentations for KStream/KTable methods (#5136 ) * KAFKA-6993: Fix defective documentations for KStream/KTable methods 1. Fix the documentation of following methods, e.g., making more detailed description for the overloaded methods: - KStream#join - KStream#leftJoin - KStream#outerJoin - KTable#filter - KTable#filterNot - KTable#mapValues - KTable#transformValues - KTable#join - KTable#leftJoin - KTable#outerJoin 2. (trivial) with possible new type -> with possibly new type. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Rajini Sivaram	a1ca07d316	MINOR: Bump version to 2.1.0-SNAPSHOT (#5153 )	7 years ago
ConcurrencyPractitioner	ba0ebca7a5	[KAFKA-6730] Simplify State Store Recovery (#5013 ) Reviewer: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>	7 years ago
Bill Bejeck	ef413699b6	KAFKA-6704: InvalidStateStoreException from IQ when StreamThread closes store (#4801 ) While using an iterator from IQ, it's possible to get an InvalidStateStoreException if the StreamThread closes the store during a range query. Added a unit test to SegmentIteratorTest for this condition. Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
John Roesler	ba5fd3c8a4	MINOR: Add regression tests for KTable mapValues and filter (#5134 ) Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Rajini Sivaram	9df3872fbd	KAFKA-3665: Enable TLS hostname verification by default (KIP-294) (#4956 ) Make HTTPS the default ssl.endpoint.identification.algorithm. Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
John Roesler	6f9f365573	KAFKA-6813: return to double-counting for count topology names (#5075 ) #4919 unintentionally changed the topology naming scheme. This change returns to the prior scheme. Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Guozhang Wang	718d6f2475	MINOR: Remove deprecated KafkaStreams constructors in docs (#5118 ) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Vahid Hashemian	0cacbcf30e	MINOR: Remove usages of JavaConversions and fix some typos (#5115 ) Reviewers: Ismael Juma <ismael@juma.me.uk>	7 years ago
Bill Bejeck	cb2f024f87	MINOR: Use thread name and task for sensor name (#5111 ) Changes to keep the operation name as is and make the sensor name unique. Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
emmanuel Harel	e24916a68f	MINOR:Fix table outer join test (#5099 )	7 years ago
Joan Goyeau	ad56f04af9	KAFKA-6936: Implicit materialized for aggregate, count and reduce (#5066 ) In #4919 we propagate the SerDes for each of these aggregation operators. As @guozhangwang mentioned in that PR: ``` reduce: inherit the key and value serdes from the parent XXImpl class. count: inherit the key serdes, enforce setting the Serdes.Long() for value serdes. aggregate: inherit the key serdes, do not set for value serdes internally. ``` Although it's all good for reduce and count, it is quiet unsafe to have aggregate without Materialized given. In fact I don't see why we would not give a Materialized for the aggregate since the result type will always be different (otherwise use reduce) and also the value Serde is simply not propagated. This has been discussed previously in a broader PR before but I believe for aggregate we could pass implicitly a Materialized the same way we pass a Joined, just to avoid the stupid case. Then if the user wants to specialize, he can give his own Materialized. Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Matthias J. Sax	d166485be1	KAFKA-6054: Add 'version probing' to Kafka Streams rebalance (#4636 ) implements KIP-268 Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Guozhang Wang	f33e9a346e	KAFKA-4936: Add dynamic routing in Streams (#5018 ) implements KIP-303 Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>	7 years ago
Florian Hussonnois	14171fa8b4	KAFKA-6957 make InternalTopologyBuilder accessible from AbstractStream subclasses (#5085 ) Currently, the AbstractStream class defines a copy-constructor that allow to extend KStream and KTable APIs with new methods without impacting the public interface. However adding new processor or/and store to the topology is made throught the internalTopologyBuilder that is not accessible from AbstractStream subclasses defined outside of the package (package visibility). Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Dark	2b6630b518	Remove duplicate code which is invoked twice (#5039 ) Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Bill Bejeck	4943c3f2f7	MINOR: reduce commit time on test (#5095 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
John Roesler	c470ff70d3	KAFKA-5697; Implement new consumer poll API from KIP-266 (#4855 ) Add the new stricter-timeout version of `poll` proposed in KIP-266. The pre-existing variant `poll(long timeout)` would block indefinitely for metadata updates if they were needed, then it would issue a fetch and poll for `timeout` ms for new records. The initial indefinite metadata block caused applications to become stuck when the brokers became unavailable. The existence of the timeout parameter made the indefinite block especially unintuitive. This PR adds `poll(Duration timeout)` with the semantics: 1. iff a metadata update is needed: 1. send (asynchronous) metadata requests 2. poll for metadata responses (counts against timeout) - if no response within timeout, return an empty collection immediately 2. if there is fetch data available, return it immediately 3. if there is no fetch request in flight, send fetch requests 4. poll for fetch responses (counts against timeout) - if no response within timeout, return an empty collection (leaving async fetch request for the next poll) - if we get a response, return the response The old method, `poll(long timeout)` is deprecated, but we do not change its semantics, so it remains: 1. iff a metadata update is needed: 1. send (asynchronous) metadata requests 2. poll for metadata responses indefinitely until we get it 2. if there is fetch data available, return it immediately 3. if there is no fetch request in flight, send fetch requests 4. poll for fetch responses (counts against timeout) - if no response within timeout, return an empty collection (leaving async fetch request for the next poll) - if we get a response, return the response One notable usage is prohibited by the new `poll`: previously, you could call `poll(0)` to block for metadata updates, for example to initialize the client, supposedly without fetching records. Note, though, that this behavior is not according to any contract, and there is no guarantee that `poll(0)` won't return records the first time it's called. Therefore, it has always been unsafe to ignore the response.	7 years ago
Ismael Juma	7132a85fc3	KAFKA-6921; Remove old Scala producer and related code * Removed Scala producers, request classes, kafka.tools.ProducerPerformance, encoders, tests. * Updated ConsoleProducer to remove Scala producer support (removed `BaseProducer` and several options that are not used by the Java producer). * Updated a few Scala consumer tests to use the new producer (including a minor refactor of `produceMessages` methods in `TestUtils`). * Updated `ClientUtils.fetchTopicMetadata` to use `SimpleConsumer` instead of `SyncProducer`. * Removed `TestKafkaAppender` as it looks useless and it defined an `Encoder`. * Minor import clean-ups No new tests added since behaviour should remain the same after these changes. Author: Ismael Juma <ismael@juma.me.uk> Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Dong Lin <lindong28@gmail.com> Closes #5045 from ijuma/kafka-6921-remove-old-producer	7 years ago
Jorge Quilcate Otoya	133108cdac	KAFKA-6850: Add Record Header support to Kafka Streams Processor API (KIP-244) (#4955 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>	7 years ago
Filipe Agapito	6281fbcb6a	KAFKA-6474: Rewrite tests to use new public TopologyTestDriver [part 3] (#5052 ) * KAFKA-6474: Rewrite tests to use new public TopologyTestDriver [part 3] * Refactor: - KStreamWindowReduceTest - KTableMapKeysTest - SessionWindowedKStreamImplTest - TimeWindowedKStreamImplTest * Remove unnecessary @SuppressWarnings(unchecked) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago
Guozhang Wang	70a506b983	MINOR: Ignore test_broker_type_bounce_at_start system test (#5055 ) test_broker_type_bounce_at_start tries to validate that when the controller is down, the streams client will always fail trying to create the topic; with the current behavior of admin client it is actually not always true: the actual behavior depends on the admin client internals as well as when the controller becomes unavailable during the leader assign partitions phase. I'd suggest at least ignore this test for now until the admin client has more stable (personally I'd even suggest removing this test as its coverage benefits is smaller than its introduced issues to me). Also adding a few more log4j entries as a result of investigating this issue. Reviewers: Matthias J. Sax <matthias@confluent.io>	7 years ago
Joan Goyeau	96cda0e07a	MINOR: Fix type inference on joins and aggregates (#5019 ) The type inference doesn't currently work for the join functions in Scala as it doesn't know yet the types of the given KStream[K, V] or KTable[K, V]. The fix here is to curry the joiner function. I personally prefer this notation but this also means it differs more from the Java API. I believe the diff with the Java API is worth in this case as it's not only solving the type inference but also fits better the Scala way of coding (ex: fold). Moreover any Scala dev will bug and spend little time on these functions trying to understand why the type inference is not working and then get frustrated to be obliged to be explicit here where it's not harmful to be inferred. Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>, Ismael Juma <ismael@juma.me.uk>	7 years ago
Guozhang Wang	9752ccad55	KAFKA-6729: Follow up; disable logging for source KTable. (#5038 ) Reviewers: Matthias J. Sax <matthias@confluent.io>	7 years ago
David Glasser	f65f3a878f	KAFKA-6905: Document that Transformers may be re-used by Streams (#5026 ) This is a follow-up to #5022 which added documentation to the Processor interface. This commit adds similar documentation to Transformer and ValueTransformer. Also, s/processor/transformer/ in the close() docs. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	7 years ago

1 2 3 4 5 ...

1008 Commits (2db7eb7a8c28f4f5d2550b45a9215948652f82ca)