MINOR: Switch order of sections on tumbling and hopping windows in streams doc. Tumbling windows are defined as "a special case of hopping time windows", but hopping windows are currently only explained later in the docs. (#8505)

Currently, tumbling windows are defined as "a special case of hopping time windows" in the streams docs, but hopping windows are only explained in a subsequent section.
I think it would make sense to switch the order of these paragraphs around. To me this also makes more sense semantically.
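
To illustrate why hopping windows are the more general concept, here is a minimal sketch in plain Java. It does not use the Kafka Streams API; the class name `WindowSketch` and the method `windowStartsFor` are hypothetical and exist only for this example. It lists the start timestamps of the epoch-aligned windows (lower bound inclusive, upper bound exclusive) that contain a given record timestamp, assuming a non-negative timestamp and an advance interval no larger than the window size.

```java
import java.util.ArrayList;
import java.util.List;

final class WindowSketch {

    // Returns the start timestamps of all epoch-aligned windows
    // [start, start + sizeMs) that contain timestampMs.
    // Assumption: 0 < advanceMs <= sizeMs and timestampMs >= 0.
    static List<Long> windowStartsFor(long timestampMs, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        // Smallest multiple of advanceMs that is strictly greater than
        // (timestampMs - sizeMs), clamped at zero.
        long first = Math.max(0, timestampMs - sizeMs + advanceMs) / advanceMs * advanceMs;
        for (long start = first; start <= timestampMs; start += advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Tumbling case: advance interval equals window size, so exactly one window.
        System.out.println(windowStartsFor(7000, 5000, 5000)); // [5000]
        // Hopping case: advance interval smaller than window size, so windows overlap.
        System.out.println(windowStartsFor(7000, 5000, 1000)); // [3000, 4000, 5000, 6000, 7000]
    }
}
```

With the advance interval equal to the window size, the loop runs once and the record falls into exactly one window, which is the tumbling behavior the docs describe as a special case of hopping windows.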

Testing
Built the site and checked that everything looks OK and that the HTML is valid (or at least does not contain any new warnings caused by this change).

Reviewers: Bill Bejeck <bbejeck@apache.org>
pull/8512/head
Sönke Liebau 5 years ago committed by GitHub
parent
commit
cc4e3aa302
  1. 76
      docs/streams/developer-guide/dsl-api.html

@@ -55,8 +55,8 @@
</ul>
</li>
<li><a class="reference internal" href="#windowing" id="id19">Windowing</a><ul>
<li><a class="reference internal" href="#tumbling-time-windows" id="id20">Tumbling time windows</a></li>
<li><a class="reference internal" href="#hopping-time-windows" id="id21">Hopping time windows</a></li>
<li><a class="reference internal" href="#tumbling-time-windows" id="id20">Tumbling time windows</a></li>
<li><a class="reference internal" href="#sliding-time-windows" id="id22">Sliding time windows</a></li>
<li><a class="reference internal" href="#session-windows" id="id23">Session Windows</a></li>
<li><a class="reference internal" href="#window-final-results" id="id31">Window Final Results</a></li>
@@ -3179,13 +3179,13 @@
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td><a class="reference internal" href="#windowing-tumbling"><span class="std std-ref">Tumbling time window</span></a></td>
<tr class="row-even"><td><a class="reference internal" href="#windowing-hopping"><span class="std std-ref">Hopping time window</span></a></td>
<td>Time-based</td>
<td>Fixed-size, non-overlapping, gap-less windows</td>
<td>Fixed-size, overlapping windows</td>
</tr>
<tr class="row-odd"><td><a class="reference internal" href="#windowing-hopping"><span class="std std-ref">Hopping time window</span></a></td>
<tr class="row-odd"><td><a class="reference internal" href="#windowing-tumbling"><span class="std std-ref">Tumbling time window</span></a></td>
<td>Time-based</td>
<td>Fixed-size, overlapping windows</td>
<td>Fixed-size, non-overlapping, gap-less windows</td>
</tr>
<tr class="row-even"><td><a class="reference internal" href="#windowing-sliding"><span class="std std-ref">Sliding time window</span></a></td>
<td>Time-based</td>
@@ -3197,39 +3197,6 @@
</tr>
</tbody>
</table>
<div class="section" id="tumbling-time-windows">
<span id="windowing-tumbling"></span><h5><a class="toc-backref" href="#id20">Tumbling time windows</a><a class="headerlink" href="#tumbling-time-windows" title="Permalink to this headline"></a></h5>
<p>Tumbling time windows are a special case of hopping time windows and, like the latter, are windows based on time
intervals. They model fixed-size, non-overlapping, gap-less windows.
A tumbling window is defined by a single property: the window&#8217;s <em>size</em>.
A tumbling window is a hopping window whose window size is equal to its advance interval.
Since tumbling windows never overlap, a data record will belong to one and only one window.</p>
<div class="figure align-center" id="id3">
<img class="centered" src="/{{version}}/images/streams-time-windows-tumbling.png">
<p class="caption"><span class="caption-text">This diagram shows windowing a stream of data records with tumbling windows. Windows do not overlap because, by
definition, the advance interval is identical to the window size. In this diagram the time numbers represent minutes;
e.g. t=5 means &#8220;at the five-minute mark&#8221;. In reality, the unit of time in Kafka Streams is milliseconds, which means
the time numbers would need to be multiplied with 60 * 1,000 to convert from minutes to milliseconds (e.g. t=5 would
become t=300,000).</span></p>
</div>
<p>Tumbling time windows are <em>aligned to the epoch</em>, with the lower interval bound being inclusive and the upper bound
being exclusive. &#8220;Aligned to the epoch&#8221; means that the first window starts at timestamp zero. For example, tumbling
windows with a size of 5000ms have predictable window boundaries <code class="docutils literal"><span class="pre">[0;5000),[5000;10000),...</span></code> &#8212; and <strong>not</strong>
<code class="docutils literal"><span class="pre">[1000;6000),[6000;11000),...</span></code> or even something &#8220;random&#8221; like <code class="docutils literal"><span class="pre">[1452;6452),[6452;11452),...</span></code>.</p>
<p>The following code defines a tumbling window with a size of 5 minutes:</p>
<div class="highlight-java"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">java.time.Duration</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.apache.kafka.streams.kstream.TimeWindows</span><span class="o">;</span>
<span class="c1">// A tumbling time window with a size of 5 minutes (and, by definition, an implicit</span>
<span class="c1">// advance interval of 5 minutes).</span>
<span class="kt">Duration</span> <span class="n">windowSizeMs</span> <span class="o">=</span> <span class="n">Duration</span><span class="o">.</span><span class="na">ofMinutes</span><span class="o">(</span><span class="mi">5</span><span class="o">);</span>
<span class="n">TimeWindows</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="n">windowSizeMs</span><span class="o">);</span>
<span class="c1">// The above is equivalent to the following code:</span>
<span class="n">TimeWindows</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="n">windowSizeMs</span><span class="o">).</span><span class="na">advanceBy</span><span class="o">(</span><span class="n">windowSizeMs</span><span class="o">);</span>
</pre></div>
</div>
</div>
<div class="section" id="hopping-time-windows">
<span id="windowing-hopping"></span><h5><a class="toc-backref" href="#id21">Hopping time windows</a><a class="headerlink" href="#hopping-time-windows" title="Permalink to this headline"></a></h5>
<p>Hopping time windows are windows based on time intervals. They model fixed-sized, (possibly) overlapping windows.
@@ -3271,6 +3238,39 @@ milliseconds (e.g. t=5 would become t=300,000).</span></p>
corresponding window instance and the embedded key can be retrieved as <code class="docutils literal"><span class="pre">Windowed#window()</span></code> and <code class="docutils literal"><span class="pre">Windowed#key()</span></code>,
respectively.</p>
</div>
<div class="section" id="tumbling-time-windows">
<span id="windowing-tumbling"></span><h5><a class="toc-backref" href="#id20">Tumbling time windows</a><a class="headerlink" href="#tumbling-time-windows" title="Permalink to this headline"></a></h5>
<p>Tumbling time windows are a special case of hopping time windows and, like the latter, are windows based on time
intervals. They model fixed-size, non-overlapping, gap-less windows.
A tumbling window is defined by a single property: the window&#8217;s <em>size</em>.
A tumbling window is a hopping window whose window size is equal to its advance interval.
Since tumbling windows never overlap, a data record will belong to one and only one window.</p>
<div class="figure align-center" id="id3">
<img class="centered" src="/{{version}}/images/streams-time-windows-tumbling.png">
<p class="caption"><span class="caption-text">This diagram shows windowing a stream of data records with tumbling windows. Windows do not overlap because, by
definition, the advance interval is identical to the window size. In this diagram the time numbers represent minutes;
e.g. t=5 means &#8220;at the five-minute mark&#8221;. In reality, the unit of time in Kafka Streams is milliseconds, which means
the time numbers would need to be multiplied with 60 * 1,000 to convert from minutes to milliseconds (e.g. t=5 would
become t=300,000).</span></p>
</div>
<p>Tumbling time windows are <em>aligned to the epoch</em>, with the lower interval bound being inclusive and the upper bound
being exclusive. &#8220;Aligned to the epoch&#8221; means that the first window starts at timestamp zero. For example, tumbling
windows with a size of 5000ms have predictable window boundaries <code class="docutils literal"><span class="pre">[0;5000),[5000;10000),...</span></code> &#8212; and <strong>not</strong>
<code class="docutils literal"><span class="pre">[1000;6000),[6000;11000),...</span></code> or even something &#8220;random&#8221; like <code class="docutils literal"><span class="pre">[1452;6452),[6452;11452),...</span></code>.</p>
<p>The following code defines a tumbling window with a size of 5 minutes:</p>
<div class="highlight-java"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">java.time.Duration</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.apache.kafka.streams.kstream.TimeWindows</span><span class="o">;</span>
<span class="c1">// A tumbling time window with a size of 5 minutes (and, by definition, an implicit</span>
<span class="c1">// advance interval of 5 minutes).</span>
<span class="kt">Duration</span> <span class="n">windowSizeMs</span> <span class="o">=</span> <span class="n">Duration</span><span class="o">.</span><span class="na">ofMinutes</span><span class="o">(</span><span class="mi">5</span><span class="o">);</span>
<span class="n">TimeWindows</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="n">windowSizeMs</span><span class="o">);</span>
<span class="c1">// The above is equivalent to the following code:</span>
<span class="n">TimeWindows</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="n">windowSizeMs</span><span class="o">).</span><span class="na">advanceBy</span><span class="o">(</span><span class="n">windowSizeMs</span><span class="o">);</span>
</pre></div>
</div>
</div>
<div class="section" id="sliding-time-windows">
<span id="windowing-sliding"></span><h5><a class="toc-backref" href="#id22">Sliding time windows</a><a class="headerlink" href="#sliding-time-windows" title="Permalink to this headline"></a></h5>
<p>Sliding windows are actually quite different from hopping and tumbling windows. In Kafka Streams, sliding windows
