KAFKA-13455: Add steps to run Kafka Connect to quickstart (#11500)

Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>
Kate Stanley 2021-11-22 13:41:24 +00:00 committed by GitHub
parent c1071327c5
commit c0b2afb353

@@ -161,12 +161,77 @@ This is my second event</code></pre>
You probably have lots of data in existing systems like relational databases or traditional messaging systems,
along with many applications that already use these systems.
<a href="/documentation/#connect">Kafka Connect</a> allows you to continuously ingest
data from external systems into Kafka, and vice versa. It is an extensible tool that runs
<i>connectors</i>, which implement the custom logic for interacting with an external system.
It is thus very easy to integrate existing systems with Kafka. To make this process even easier,
there are hundreds of such connectors readily available.
</p>
<p>Take a look at the <a href="/documentation/#connect">Kafka Connect section</a>
to learn more about how to continuously import/export your data into and out of Kafka.</p>
<p>
In this quickstart we'll see how to run Kafka Connect with simple connectors that import data
from a file to a Kafka topic and export data from a Kafka topic to a file.
</p>
<p>
First, we'll create some seed data to test with:
</p>
<pre class="brush: bash;">
&gt; echo -e "foo\nbar" > test.txt</pre>
Or on Windows:
<pre class="brush: bash;">
&gt; echo foo> test.txt
&gt; echo bar>> test.txt</pre>
<p>
Next, we'll start two connectors running in <i>standalone</i> mode, which means they run in a single, local, dedicated
process. We provide three configuration files as parameters. The first is always the configuration for the Kafka Connect
process, containing common configuration such as the Kafka brokers to connect to and the serialization format for data.
The remaining configuration files each specify a connector to create. These files include a unique connector name, the connector
class to instantiate, and any other configuration required by the connector.
</p>
<pre class="brush: bash;">
&gt; bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties</pre>
<p>
These sample configuration files, included with Kafka, use the default local cluster configuration you started earlier
and create two connectors: the first is a source connector that reads lines from an input file and produces each to a Kafka topic,
and the second is a sink connector that reads messages from a Kafka topic and produces each as a line in an output file.
</p>
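<p>
For reference, the two bundled connector files look roughly like the following; the exact contents may differ
slightly between Kafka versions, so check the copies in your own <code>config</code> directory:
</p>
<pre class="brush: bash;">
&gt; cat config/connect-file-source.properties
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test

&gt; cat config/connect-file-sink.properties
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test</pre>
<p>
Note that the source connector names a single <code>topic</code> to produce to, while the sink connector uses
<code>topics</code>, since a sink can consume from several topics at once.
</p>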
<p>
During startup you'll see a number of log messages, including some indicating that the connectors are being instantiated.
Once the Kafka Connect process has started, the source connector should start reading lines from <code>test.txt</code> and
producing them to the topic <code>connect-test</code>, and the sink connector should start reading messages from the topic <code>connect-test</code>
and writing them to the file <code>test.sink.txt</code>. We can verify the data has been delivered through the entire pipeline
by examining the contents of the output file:
</p>
<pre class="brush: bash;">
&gt; more test.sink.txt
foo
bar</pre>
<p>
Note that the data is being stored in the Kafka topic <code>connect-test</code>, so we can also run a console consumer to see the
data in the topic (or use custom consumer code to process it):
</p>
<pre class="brush: bash;">
&gt; bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
...</pre>
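<p>
The <code>schema</code>/<code>payload</code> wrapper around each record comes from the converters configured in the
worker file. In the <code>connect-standalone.properties</code> shipped with Kafka, the relevant settings look
roughly like this (your copy may differ):
</p>
<pre class="brush: bash;">
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true</pre>
<p>
Setting <code>schemas.enable=false</code> on a converter would drop the wrapper and leave just the plain payload.
</p>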
<p>The connectors continue to process data, so we can add data to the file and see it move through the pipeline:</p>
<pre class="brush: bash;">
&gt; echo Another line>> test.txt</pre>
<p>You should see the line appear in the console consumer output and in the sink file.</p>
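<p>
To watch the pipeline continuously you can, assuming a Unix-like shell with <code>tail</code> available, follow
the sink file in the background while appending more lines; it may take a few seconds for each line to show up,
since the connectors poll and flush periodically:
</p>
<pre class="brush: bash;">
&gt; tail -f test.sink.txt &amp;
&gt; echo One more line>> test.txt
One more line</pre>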
</div>