KAFKA-13455: Add steps to run Kafka Connect to quickstart (#11500)
Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>

<p>
You probably have lots of data in existing systems like relational databases or traditional messaging systems,
along with many applications that already use these systems.
<a href="/documentation/#connect">Kafka Connect</a> allows you to continuously ingest
data from external systems into Kafka, and vice versa. It is an extensible tool that runs
<i>connectors</i>, which implement the custom logic for interacting with an external system.
It is thus very easy to integrate existing systems with Kafka. To make this process even easier,
there are hundreds of such connectors readily available.
</p>

<p>
In this quickstart we'll see how to run Kafka Connect with simple connectors that import data
from a file to a Kafka topic and export data from a Kafka topic to a file.
</p>

<p>
First, we'll start by creating some seed data to test with:
</p>

<pre class="brush: bash;">
> echo -e "foo\nbar" > test.txt</pre>
Or on Windows:
<pre class="brush: bash;">
> echo foo> test.txt
> echo bar>> test.txt</pre>

<p>
Next, we'll start two connectors running in <i>standalone</i> mode, which means they run in a single, local, dedicated
process. We provide three configuration files as parameters. The first is always the configuration for the Kafka Connect
process, containing common configuration such as the Kafka brokers to connect to and the serialization format for data.
The remaining configuration files each specify a connector to create. These files include a unique connector name, the connector
class to instantiate, and any other configuration required by the connector.
</p>

<pre class="brush: bash;">
> bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties</pre>

<p>
These sample configuration files, included with Kafka, use the default local cluster configuration you started earlier
and create two connectors: the first is a source connector that reads lines from an input file and produces each one to a Kafka topic,
and the second is a sink connector that reads messages from a Kafka topic and produces each one as a line in an output file.
</p>
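
<p>
For reference, the connector definitions in these sample files look roughly like the following (abridged; check the copies
shipped with your Kafka release, since exact contents can vary between versions):
</p>

<pre class="brush: text;">
# config/connect-file-source.properties
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test

# config/connect-file-sink.properties
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test</pre>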

<p>
During startup you'll see a number of log messages, including some indicating that the connectors are being instantiated.
Once the Kafka Connect process has started, the source connector should start reading lines from <code>test.txt</code> and
producing them to the topic <code>connect-test</code>, and the sink connector should start reading messages from the topic <code>connect-test</code>
and writing them to the file <code>test.sink.txt</code>. We can verify the data has been delivered through the entire pipeline
by examining the contents of the output file:
</p>

<pre class="brush: bash;">
> more test.sink.txt
foo
bar</pre>
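
<p>
You can also confirm that both connectors were created by querying Kafka Connect's REST interface, which standalone mode
serves on port 8083 by default (the names shown assume the sample configuration files above):
</p>

<pre class="brush: bash;">
> curl -s localhost:8083/connectors
["local-file-source","local-file-sink"]</pre>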

<p>
Note that the data is being stored in the Kafka topic <code>connect-test</code>, so we can also run a console consumer to see the
data in the topic (or use custom consumer code to process it):
</p>

<pre class="brush: bash;">
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
...</pre>
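
<p>
Each record value above is a JSON envelope with a <code>schema</code> and a <code>payload</code>, which is how the JSON
converter configured in the sample <code>connect-standalone.properties</code> serializes the file's lines. If you'd rather
process the records programmatically than watch the console consumer, a minimal consumer might look like the following
sketch (the class name and group id here are illustrative, not part of Kafka):
</p>

<pre class="brush: java;">
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConnectTestConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "connect-test-reader"); // illustrative group id
        props.put("auto.offset.reset", "earliest");   // start from the beginning of the topic
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer&lt;String, String&gt; consumer = new KafkaConsumer&lt;&gt;(props)) {
            consumer.subscribe(List.of("connect-test"));
            while (true) {
                ConsumerRecords&lt;String, String&gt; records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord&lt;String, String&gt; record : records) {
                    // Each value is the JSON envelope shown above, e.g. {"schema":...,"payload":"foo"}
                    System.out.println(record.value());
                }
            }
        }
    }
}</pre>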

<p>The connectors continue to process data, so we can add data to the file and see it move through the pipeline:</p>

<pre class="brush: bash;">
> echo Another line>> test.txt</pre>

<p>You should see the line appear in the console consumer output and in the sink file.</p>
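
<p>
For example, viewing the sink file again should now show the appended line:
</p>

<pre class="brush: bash;">
> more test.sink.txt
foo
bar
Another line</pre>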
</div>