mirror of https://github.com/apache/kafka.git
KAFKA-5839: Upgrade Guide doc changes for KIP-130
Author: Florian Hussonnois <florian.hussonnois@gmail.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #3811 from fhussonnois/KAFKA-5839 minor fixes
parent bd0146d984
commit 398714b758
|
@ -2904,6 +2904,16 @@ Note that in the <code>WordCountProcessor</code> implementation, users need to r
|
|||
);
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
To retrieve information about the local running threads, you can use the <code>localThreadsMetadata()</code> method after you start the application.
|
||||
</p>
|
||||
|
||||
<pre class="brush: java;">
|
||||
// For instance, use this method to print/monitor the partitions assigned to each local task.
|
||||
Set<ThreadMetadata> threads = streams.localThreadsMetadata();
|
||||
...
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
To stop the application instance, call the <code>close()</code> method:
|
||||
</p>
|
||||
|
|
|
@ -49,7 +49,7 @@
|
|||
|
||||
<p>
|
||||
With 1.0, a major API refactoring was completed; the new API is cleaner and easier to use.
|
||||
This change includes the five main classes <code>KafkaStreams<code>, <code>KStreamBuilder</code>,
|
||||
This change includes the five main classes <code>KafkaStreams</code>, <code>KStreamBuilder</code>,
|
||||
<code>KStream</code>, <code>KTable</code>, and <code>TopologyBuilder</code> (and a few others).
|
||||
All changes are fully backward compatible because the old API is only deprecated, not removed.
|
||||
We recommend moving to the new API as soon as you can.
|
||||
|
@ -59,7 +59,7 @@
|
|||
<p>
|
||||
The two main classes to specify a topology via the DSL (<code>KStreamBuilder</code>)
|
||||
or the Processor API (<code>TopologyBuilder</code>) were deprecated and replaced by
|
||||
<code>StreamsBuilder</code> and <code>Topology<code> (both new classes are located in
|
||||
<code>StreamsBuilder</code> and <code>Topology</code> (both new classes are located in
|
||||
package <code>org.apache.kafka.streams</code>).
|
||||
Note that <code>StreamsBuilder</code> does not extend <code>Topology</code>, i.e.,
|
||||
the class hierarchy is different now.
|
||||
|
@ -74,7 +74,7 @@
|
|||
</p>
|
||||
|
||||
<p>
|
||||
Changing how a topology is specified also affects <code>KafkaStreams<code> constructors,
|
||||
Changing how a topology is specified also affects <code>KafkaStreams</code> constructors,
|
||||
which now accept only a <code>Topology</code>.
|
||||
Using the DSL builder class <code>StreamsBuilder</code>, one can get the constructed
|
||||
<code>Topology</code> via <code>StreamsBuilder#build()</code>.
|
||||
|
@ -86,24 +86,50 @@
|
|||
</p>
|
||||
|
||||
<p>
|
||||
With the introduction of <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines">KIP-182</a>
|
||||
you should no longer pass in <code>Serde</code> to <code>KStream#print</code> operations.
|
||||
If you can't rely on using <code>toString</code> to print your keys and values, you should instead provide a custom <code>KeyValueMapper</code> via the <code>Printed#withKeyValueMapper</code> call.
|
||||
New methods in <code>KafkaStreams</code>:
|
||||
</p>
|
||||
<ul>
|
||||
<li> retrieve the current runtime information about the local threads via <code>#localThreadsMetadata()</code> </li>
|
||||
</ul>
|
||||
<p>
|
||||
Deprecated methods in <code>KafkaStreams</code>:
|
||||
</p>
|
||||
<ul>
|
||||
<li><code>toString()</code></li>
|
||||
<li><code>toString(final String indent)</code></li>
|
||||
</ul>
|
||||
<p>
|
||||
Previously the above methods were used to return static and runtime information.
|
||||
They have been deprecated in favor of using the new classes/methods <code>#localThreadsMetadata()</code> / <code>ThreadMetadata</code> (returning runtime information) and
|
||||
<code>TopologyDescription</code> / <code>Topology#describe()</code> (returning static information).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
More deprecated methods in <code>KafkaStreams</code>:
|
||||
</p>
|
||||
<ul>
|
||||
<li>With the introduction of <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines">KIP-182</a>
|
||||
you should no longer pass in <code>Serde</code> to <code>KStream#print</code> operations.
|
||||
If you can't rely on using <code>toString</code> to print your keys and values, you should instead provide a custom <code>KeyValueMapper</code> via the <code>Printed#withKeyValueMapper</code> call.
|
||||
</li>
|
||||
<li>
|
||||
Windowed aggregations have moved from <code>KGroupedStream</code> to <code>WindowedKStream</code>.
|
||||
You can now perform a windowed aggregation by, for example, using <code>KGroupedStream#windowedBy(Windows)#reduce(Reducer)</code>.
|
||||
Note: the previous aggregate functions on <code>KGroupedStream</code> still work, but have been deprecated.
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
Modified methods in <code>Processor</code>:
|
||||
</p>
|
||||
<ul>
|
||||
<li>
|
||||
<p>
|
||||
The Processor API was extended to allow users to schedule <code>punctuate</code> functions either based on data-driven <b>stream time</b> or wall-clock time.
|
||||
As a result, the original <code>ProcessorContext#schedule</code> is deprecated in favor of a new overloaded function that accepts a user-customizable <code>Punctuator</code> callback interface, which triggers its <code>punctuate</code> API method periodically based on the <code>PunctuationType</code>.
|
||||
The <code>PunctuationType</code> determines what notion of time is used for the punctuation scheduling: either <a href="/{{version}}/documentation/streams/core-concepts#streams_time">stream time</a> or wall-clock time (by default, <b>stream time</b> is configured to represent event time via <code>TimestampExtractor</code>).
|
||||
In addition, the <code>punctuate</code> function inside <code>Processor</code> is also deprecated.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Before this, users could only schedule based on stream time (i.e. <code>PunctuationType.STREAM_TIME</code>), and hence the <code>punctuate</code> function was data-driven only, because stream time is determined (and advanced forward) by the timestamps derived from the input data.
|
||||
If no data arrives at the processor, stream time does not advance and hence punctuation is not triggered.
|
||||
|
@ -113,6 +139,8 @@
|
|||
if these 60 records were processed within 5 seconds, then no <code>punctuate</code> would be called at all.
|
||||
Users can schedule multiple <code>Punctuator</code> callbacks with different <code>PunctuationType</code>s within the same processor by simply calling <code>ProcessorContext#schedule</code> multiple times inside the processor's <code>init()</code> method.
|
||||
</p>
|
||||
</li>
|
||||
</ul>
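<p>
The difference between the two punctuation types described above can be illustrated with a small simulation that deliberately does not use the Kafka Streams API. The class <code>PunctuationSketch</code> and its methods are hypothetical names chosen for this sketch. The firing rule (the schedule starts at time 0 and fires once for every full interval the observed clock has passed) and the record timestamps of 1 to 60 seconds are simplifying assumptions, not the library's actual scheduler or data.
</p>

```java
/**
 * Hypothetical, library-free model of punctuation scheduling.
 * None of these names are part of the Kafka Streams API.
 */
class PunctuationSketch {

    /**
     * Counts how often a punctuation would fire, assuming the schedule starts
     * at time 0 and fires once per full interval that the observed clock passes.
     */
    static int countPunctuations(long[] observedTimesMs, long intervalMs) {
        long nextDueMs = intervalMs;
        int fired = 0;
        for (long nowMs : observedTimesMs) {
            // Catch up on every interval boundary the clock has reached.
            while (nowMs >= nextDueMs) {
                fired++;
                nextDueMs += intervalMs;
            }
        }
        return fired;
    }

    /** Returns `count` evenly spaced clock readings; the last one equals totalMs. */
    static long[] evenlySpaced(int count, long totalMs) {
        long[] times = new long[count];
        for (int i = 0; i < count; i++) {
            times[i] = (i + 1) * totalMs / count;
        }
        return times;
    }

    public static void main(String[] args) {
        long intervalMs = 10_000L;

        // Stream time: 60 records with assumed timestamps of 1s, 2s, ..., 60s.
        long[] streamTime = evenlySpaced(60, 60_000L);
        // Wall-clock time: the same 60 records processed within 5 seconds.
        long[] wallClock = evenlySpaced(60, 5_000L);
        // No records at all: stream time is never advanced.
        long[] noData = new long[0];

        System.out.println("stream time: " + countPunctuations(streamTime, intervalMs));
        System.out.println("wall clock: " + countPunctuations(wallClock, intervalMs));
        System.out.println("no data: " + countPunctuations(noData, intervalMs));
    }
}
```

<p>
Under these assumptions the sketch prints 6 for stream time, 0 for wall-clock time, and 0 when no data arrives. In this model the clock is only observed when a record is processed, which is accurate for stream time but a simplification for wall-clock time.
</p>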
|
||||
|
||||
<p>
|
||||
If you are monitoring Streams metrics at the task level or at the processor-node / state-store level, please note that the metrics sensor names and hierarchy were changed:
|
||||
|
|