diff --git a/docs/js/templateData.js b/docs/js/templateData.js
index 3eca71ede69..50997bd6b26 100644
--- a/docs/js/templateData.js
+++ b/docs/js/templateData.js
@@ -20,4 +20,5 @@ var context={
"version": "0110",
"dotVersion": "0.11.0",
"fullDotVersion": "0.11.0.0"
+ "scalaVersion:" "2.11"
};
diff --git a/docs/streams/quickstart.html b/docs/streams/quickstart.html
index 1c45e16a729..031a37516ec 100644
--- a/docs/streams/quickstart.html
+++ b/docs/streams/quickstart.html
@@ -40,10 +40,10 @@ of the
As the first step, we will start Kafka (unless you already have it started) and then we will
prepare input data to a Kafka topic, which will subsequently be processed by a Kafka Streams application.
+
-> tar -xzf kafka_2.11-{{fullDotVersion}}.tgz
-> cd kafka_2.11-{{fullDotVersion}}
+> tar -xzf kafka_{{scalaVersion}}-{{fullDotVersion}}.tgz
+> cd kafka_{{scalaVersion}}-{{fullDotVersion}}
@@ -102,19 +104,9 @@ Kafka uses ZooKeeper so you need to
-> echo -e "all streams lead to kafka\nhello kafka streams\njoin kafka summit" > file-input.txt
@@ -126,41 +118,59 @@ Or on Windows:
-> echo|set /p=join kafka summit>> file-input.txt
-
-Next, we send this input data to the input topic named streams-file-input using the console producer,
-which reads the data from STDIN line-by-line, and publishes each line as a separate Kafka message with null key and value encoded a string to the topic (in practice,
-stream data will likely be flowing continuously into Kafka where the application will be up and running):
-
+-->
+
+Next, we create the input topic named streams-wordcount-input and the output topic named streams-wordcount-output:
 > bin/kafka-topics.sh --create \
     --zookeeper localhost:2181 \
     --replication-factor 1 \
     --partitions 1 \
-    --topic streams-file-input
+    --topic streams-wordcount-input
+Created topic "streams-wordcount-input".
+
+> bin/kafka-topics.sh --create \
+    --zookeeper localhost:2181 \
+    --replication-factor 1 \
+    --partitions 1 \
+    --topic streams-wordcount-output
+Created topic "streams-wordcount-output".
+
+The created topic can be described with the same kafka-topics tool:
-> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic streams-file-input < file-input.txt
+> bin/kafka-topics.sh --zookeeper localhost:2181 --describe
+
+Topic:streams-wordcount-input    PartitionCount:1    ReplicationFactor:1    Configs:
+    Topic: streams-wordcount-input    Partition: 0    Leader: 0    Replicas: 0    Isr: 0
+Topic:streams-wordcount-output    PartitionCount:1    ReplicationFactor:1    Configs:
+    Topic: streams-wordcount-output    Partition: 0    Leader: 0    Replicas: 0    Isr: 0
-
> bin/kafka-run-class.sh org.apache.kafka.streams.examples.wordcount.WordCountDemo
-The demo application will read from the input topic streams-file-input, perform the computations of the WordCount algorithm on each of the read messages,
+The demo application will read from the input topic streams-wordcount-input, perform the computations of the WordCount algorithm on each of the read messages,
 and continuously write its current results to the output topic streams-wordcount-output.
 Hence there won't be any STDOUT output except log entries as the results are written back into Kafka.
-The demo will run for a few seconds and then, unlike typical stream processing applications, terminate automatically.
-
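[Editor's aside, not part of the diff: for readers who want to see what the demo computes, here is a condensed sketch of a WordCount topology in the 0.11 Streams DSL. It approximates the shipped WordCountDemo rather than reproducing its source; the class name, application id, and omission of shutdown handling are illustrative choices.]

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

public class WordCountSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wordcount");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KStreamBuilder builder = new KStreamBuilder();

        // Read each input line as a message value, split it into words,
        // group by word, and maintain a running count per word.
        KStream<String, String> source = builder.stream("streams-wordcount-input");
        KTable<String, Long> counts = source
            .flatMapValues(value -> Arrays.asList(value.toLowerCase(Locale.getDefault()).split(" ")))
            .groupBy((key, word) -> word)
            .count("Counts");

        // Every update to the KTable is emitted downstream to the output topic.
        counts.to(Serdes.String(), Serdes.Long(), "streams-wordcount-output");

        new KafkaStreams(builder, props).start();
    }
}
```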
-
-We can now inspect the output of the WordCount demo application by reading from its output topic:
+Now we can start the console producer in a separate terminal to write some input data to this topic:
+
+> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic streams-wordcount-input
+
+and inspect the output of the WordCount demo application by reading from its output topic with the console consumer in a separate terminal:
+
 > bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
     --topic streams-wordcount-output \
@@ -172,27 +182,115 @@ We can now inspect the output of the WordCount demo application by reading from
     --property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
-
-with the following output data being printed to the console:
-
+
-all      1
-lead     1
-to       1
-hello    1
-streams  2
-join     1
-kafka    3
-summit   1
+> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic streams-wordcount-input
+all streams lead to kafka
-Here, the first column is the Kafka message key in java.lang.String format, and the second column is the message value in java.lang.Long format.
-Note that the output is actually a continuous stream of updates, where each data record (i.e. each line in the original output above) is
-an updated count of a single word, aka record key such as "kafka". For multiple records with the same key, each later record is an update of the previous one.
+This message will be processed by the WordCount application and the following output data will be written to the streams-wordcount-output topic and printed by the console consumer:
+> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
+    --topic streams-wordcount-output \
+    --from-beginning \
+    --formatter kafka.tools.DefaultMessageFormatter \
+    --property print.key=true \
+    --property print.value=true \
+    --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \
+    --property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
+
+all      1
+streams  1
+lead     1
+to       1
+kafka    1
+
+Here, the first column is the Kafka message key in java.lang.String format and represents a word that is being counted, and the second column is the message value in java.lang.Long format, representing the word's latest count.
+
+Let's enter one more input text line "hello kafka streams" and hit <RETURN> in the console producer to the input topic streams-wordcount-input:
+
+> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic streams-wordcount-input
+all streams lead to kafka
+hello kafka streams
+
+In your other terminal in which the console consumer is running, you will observe that the WordCount application wrote new output data:
+
+> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
+    --topic streams-wordcount-output \
+    --from-beginning \
+    --formatter kafka.tools.DefaultMessageFormatter \
+    --property print.key=true \
+    --property print.value=true \
+    --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \
+    --property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
+
+all      1
+streams  1
+lead     1
+to       1
+kafka    1
+hello    1
+kafka    2
+streams  2
+
+Here the last printed lines kafka 2 and streams 2 indicate updates to the keys kafka and streams whose counts have been incremented from 1 to 2.
+Whenever you write further input messages to the input topic, you will observe new messages being added to the streams-wordcount-output topic,
+representing the most recent word counts as computed by the WordCount application.
+Let's enter one final input text line "join kafka summit" and hit <RETURN> in the console producer to the input topic streams-wordcount-input before we wrap up this quickstart:
+
+> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic streams-wordcount-input
+all streams lead to kafka
+hello kafka streams
+join kafka summit
+
+The streams-wordcount-output topic will subsequently show the corresponding updated word counts (see last three lines):
+
+> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
+    --topic streams-wordcount-output \
+    --from-beginning \
+    --formatter kafka.tools.DefaultMessageFormatter \
+    --property print.key=true \
+    --property print.value=true \
+    --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \
+    --property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
+
+all      1
+streams  1
+lead     1
+to       1
+kafka    1
+hello    1
+kafka    2
+streams  2
+join     1
+kafka    3
+summit   1
+
+As one can see, the output of the WordCount application is actually a continuous stream of updates, where each output record (i.e. each line in the original output above) is
+an updated count of a single word, i.e. a record key such as "kafka". For multiple records with the same key, each later record is an update of the previous one.
+
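[Editor's aside, not part of the diff: to make "each later record is an update of the previous one" concrete, here is a minimal sketch of a plain consumer that replays the output topic and keeps only the latest count per word. The class name, group id, and poll timeout are illustrative; this helper is not part of the quickstart.]

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LatestCounts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "latest-counts-inspector");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.LongDeserializer");

        // Later records for the same key overwrite earlier ones (upsert
        // semantics), so the map always holds the most recent count per word.
        Map<String, Long> latestCounts = new HashMap<>();
        try (KafkaConsumer<String, Long> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("streams-wordcount-output"));
            for (ConsumerRecord<String, Long> record : consumer.poll(5000)) {
                latestCounts.put(record.key(), record.value());
            }
        }
        // After the example input above, latestCounts ends up as:
        // {all=1, streams=2, lead=1, to=1, kafka=3, hello=1, join=1, summit=1}
        System.out.println(latestCounts);
    }
}
```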
The two diagrams below illustrate what is essentially happening behind the scenes.
The first column shows the evolution of the current state of the KTable<String, Long> that is counting word occurrences for count.
@@ -217,13 +315,9 @@ And so on (we skip the illustration of how the third line is being processed). T
Looking beyond the scope of this concrete example, what Kafka Streams is doing here is to leverage the duality between a table and a changelog stream (here: table = the KTable, changelog stream = the downstream KStream): you can publish every change of the table to a stream, and if you consume the entire changelog stream from beginning to end, you can reconstruct the contents of the table.
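[Editor's aside, not part of the diff: a minimal sketch of both directions of this duality in the 0.11 Streams DSL. The store name "reconstructed-counts" and the topic "streams-wordcount-changelog" are illustrative assumptions; this snippet only builds a topology.]

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

public class DualitySketch {
    public static void main(String[] args) {
        KStreamBuilder builder = new KStreamBuilder();

        // Stream -> table: reading a topic as a table replays its changelog
        // from the beginning and keeps the latest value per key, which
        // reconstructs the table's contents.
        KTable<String, Long> counts =
            builder.table(Serdes.String(), Serdes.Long(), "streams-wordcount-output", "reconstructed-counts");

        // Table -> changelog stream: every subsequent update to the table
        // becomes a record on the downstream KStream, which can be published
        // back to a topic.
        KStream<String, Long> changelog = counts.toStream();
        changelog.to(Serdes.String(), Serdes.Long(), "streams-wordcount-changelog");
    }
}
```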
-Now you can write more input messages to the streams-file-input topic and observe additional messages added
-to streams-wordcount-output topic, reflecting updated word counts (e.g., using the console producer and the
-console consumer, as described above).
-
-You can stop the console consumer via Ctrl-C.
+You can now stop the console consumer, the console producer, the WordCount application, the Kafka broker and the ZooKeeper server in order via Ctrl-C.