
KAFKA-5839: Upgrade Guide doc changes for KIP-130

Author: Florian Hussonnois <florian.hussonnois@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #3811 from fhussonnois/KAFKA-5839

minor fixes
pull/3758/merge
Florian Hussonnois 7 years ago committed by Guozhang Wang
parent
commit
398714b758
  1. 10
      docs/streams/developer-guide.html
  2. 44
      docs/streams/upgrade-guide.html

10
docs/streams/developer-guide.html

@@ -2904,6 +2904,16 @@ Note that in the <code>WordCountProcessor</code> implementation, users need to r
);
</pre>
<p>
To retrieve information about the local running threads, you can use the <code>localThreadsMetadata()</code> method after you start the application.
</p>
<pre class="brush: java;">
// For instance, use this method to print/monitor the partitions assigned to each local task.
Set&lt;ThreadMetadata&gt; threads = streams.localThreadsMetadata();
...
</pre>
<p>
To stop the application instance call the <code>close()</code> method:
</p>

44
docs/streams/upgrade-guide.html

@@ -49,7 +49,7 @@
<p>
With 1.0 a major API refactoring was accomplished and the new API is cleaner and easier to use.
This change includes the five main classes <code>KafkaStreams<code>, <code>KStreamBuilder</code>,
This change includes the five main classes <code>KafkaStreams</code>, <code>KStreamBuilder</code>,
<code>KStream</code>, <code>KTable</code>, and <code>TopologyBuilder</code> (and a few others).
All changes are fully backward compatible, as the old API is only deprecated but not removed.
We recommend moving to the new API as soon as you can.
@@ -59,7 +59,7 @@
<p>
The two main classes to specify a topology via the DSL (<code>KStreamBuilder</code>)
or the Processor API (<code>TopologyBuilder</code>) were deprecated and replaced by
<code>StreamsBuilder</code> and <code>Topology<code> (both new classes are located in
<code>StreamsBuilder</code> and <code>Topology</code> (both new classes are located in
package <code>org.apache.kafka.streams</code>).
Note that <code>StreamsBuilder</code> does not extend <code>Topology</code>, i.e.,
the class hierarchy is different now.
@@ -74,7 +74,7 @@
</p>
<p>
Changing how a topology is specified also affects <code>KafkaStreams<code> constructors,
Changing how a topology is specified also affects <code>KafkaStreams</code> constructors,
that now only accept a <code>Topology</code>.
Using the DSL builder class <code>StreamsBuilder</code> one can get the constructed
<code>Topology</code> via <code>StreamsBuilder#build()</code>.
@@ -86,24 +86,50 @@
</p>
<p>
With the introduction of <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines">KIP-182</a>
you should no longer pass in <code>Serde</code> to <code>KStream#print</code> operations.
If you can't rely on using <code>toString</code> to print your keys and values, you should instead provide a custom <code>KeyValueMapper</code> via the <code>Printed#withKeyValueMapper</code> call.
New methods in <code>KafkaStreams</code>:
</p>
<ul>
<li> retrieve the current runtime information about the local threads via <code>#localThreadsMetadata()</code> </li>
</ul>
<p>
Deprecated methods in <code>KafkaStreams</code>:
</p>
<ul>
<li><code>toString()</code></li>
<li><code>toString(final String indent)</code></li>
</ul>
<p>
Previously the above methods were used to return static and runtime information.
They have been deprecated in favor of using the new classes/methods <code>#localThreadsMetadata()</code> / <code>ThreadMetadata</code> (returning runtime information) and
<code>TopologyDescription</code> / <code>Topology#describe()</code> (returning static information).
</p>
<p>
More deprecated methods in <code>KafkaStreams</code>:
</p>
<ul>
<li>With the introduction of <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines">KIP-182</a>
you should no longer pass in <code>Serde</code> to <code>KStream#print</code> operations.
If you can't rely on using <code>toString</code> to print your keys and values, you should instead provide a custom <code>KeyValueMapper</code> via the <code>Printed#withKeyValueMapper</code> call.
</li>
<li>
<p>
Windowed aggregations have moved from <code>KGroupedStream</code> to <code>WindowedKStream</code>.
You can now perform a windowed aggregation by, for example, using <code>KGroupedStream#windowedBy(Windows)#reduce(Reducer)</code>.
Note: the previous aggregate functions on <code>KGroupedStream</code> still work, but have been deprecated.
</p>
</li>
</ul>
<p>
Modified methods in <code>Processor</code>:
</p>
<ul>
<li>
<p>
The Processor API was extended to allow users to schedule <code>punctuate</code> functions based on either data-driven <b>stream time</b> or wall-clock time.
As a result, the original <code>ProcessorContext#schedule</code> is deprecated in favor of a new overload that accepts a user-customizable <code>Punctuator</code> callback interface, whose <code>punctuate</code> method is triggered periodically according to the <code>PunctuationType</code>.
The <code>PunctuationType</code> determines which notion of time is used for the punctuation scheduling: either <a href="/{{version}}/documentation/streams/core-concepts#streams_time">stream time</a> or wall-clock time (by default, <b>stream time</b> is configured to represent event time via <code>TimestampExtractor</code>).
In addition, the <code>punctuate</code> function inside <code>Processor</code> is also deprecated.
</p>
<p>
Before this change, users could only schedule based on stream time (i.e. <code>PunctuationType.STREAM_TIME</code>), so the <code>punctuate</code> function was purely data-driven: stream time is determined (and advanced) by the timestamps derived from the input data.
If no data arrives at the processor, stream time does not advance and punctuation is not triggered.
@@ -113,6 +139,8 @@
if these 60 records were processed within 5 seconds, then no <code>punctuate</code> would be called at all.
Users can schedule multiple <code>Punctuator</code> callbacks with different <code>PunctuationType</code>s within the same processor by calling <code>ProcessorContext#schedule</code> multiple times inside the processor's <code>init()</code> method.
</p>
</li>
</ul>
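<p>
The difference between the two notions of time can be illustrated with a small, library-free model.
The following is a minimal sketch, not Kafka Streams code: the class <code>PunctuationSketch</code> and all of its methods are hypothetical names introduced here for illustration, and the model assumes that the schedule starts counting at time 0 and fires once for every full interval that has elapsed.
</p>

```java
import java.util.function.LongConsumer;

/**
 * Simplified model of a periodic punctuation schedule.
 * Hypothetical helper for illustration only; it does not use or mirror
 * the Kafka Streams implementation.
 */
final class PunctuationSketch {
    private final long intervalMs;
    private final LongConsumer callback;
    private long nextDueMs;

    PunctuationSketch(long intervalMs, LongConsumer callback) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        this.intervalMs = intervalMs;
        this.callback = callback;
        // Assumption of this model: the schedule starts counting at time 0.
        this.nextDueMs = intervalMs;
    }

    /** Advances the modelled clock and fires the callback once per elapsed interval. */
    void advanceTo(long nowMs) {
        while (nowMs >= nextDueMs) {
            callback.accept(nextDueMs);
            nextDueMs += intervalMs;
        }
    }

    /** Stream time: the clock only moves when a record timestamp is observed. */
    static int countStreamTime(long intervalMs, long[] recordTimestampsMs) {
        int[] fired = {0};
        PunctuationSketch schedule = new PunctuationSketch(intervalMs, t -> fired[0]++);
        for (long timestampMs : recordTimestampsMs) {
            schedule.advanceTo(timestampMs);
        }
        return fired[0];
    }

    /** Wall-clock time: the clock moves with elapsed processing time, independent of the data. */
    static int countWallClock(long intervalMs, long processingDurationMs) {
        int[] fired = {0};
        PunctuationSketch schedule = new PunctuationSketch(intervalMs, t -> fired[0]++);
        schedule.advanceTo(processingDurationMs);
        return fired[0];
    }

    /** Builds timestamps 1s, 2s, ..., count seconds, expressed in milliseconds. */
    static long[] consecutiveSeconds(int count) {
        long[] timestamps = new long[count];
        for (int i = 0; i < count; i++) {
            timestamps[i] = 1000L * (i + 1);
        }
        return timestamps;
    }
}
```

<p>
In this model, 60 records with consecutive timestamps from 1 to 60 seconds and a 10-second stream-time interval fire the callback six times, however quickly the records are processed.
With a 10-second wall-clock interval and the same records processed within 5 seconds, the callback does not fire at all.
</p>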
<p>
If you monitor Streams metrics at the task level or at the processor-node / state-store level, please note that the metrics sensor names and hierarchy have changed:
