KAFKA-13455: Add steps to run Kafka Connect to quickstart (#11500)
Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>

<p>
You probably have lots of data in existing systems like relational databases or traditional messaging systems,
along with many applications that already use these systems.
<a href="/documentation/#connect">Kafka Connect</a> allows you to continuously ingest
data from external systems into Kafka, and vice versa. It is an extensible tool that runs
<i>connectors</i>, which implement the custom logic for interacting with an external system.
It is thus very easy to integrate existing systems with Kafka. To make this process even easier,
there are hundreds of such connectors readily available.
</p>

<p>
In this quickstart we'll see how to run Kafka Connect with simple connectors that import data
from a file to a Kafka topic and export data from a Kafka topic to a file.
</p>

<p>
First, we'll start by creating some seed data to test with:
</p>

<pre class="brush: bash;">
> echo -e "foo\nbar" > test.txt</pre>
Or on Windows:
<pre class="brush: bash;">
> echo foo> test.txt
> echo bar>> test.txt</pre>

<p>
Next, we'll start two connectors running in <i>standalone</i> mode, which means they run in a single, local, dedicated
process. We provide three configuration files as parameters. The first is always the configuration for the Kafka Connect
process, containing common configuration such as the Kafka brokers to connect to and the serialization format for data.
The remaining configuration files each specify a connector to create. These files include a unique connector name, the connector
class to instantiate, and any other configuration required by the connector.
</p>

<pre class="brush: bash;">
> bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties</pre>

<p>
These sample configuration files, included with Kafka, use the default local cluster configuration you started earlier
and create two connectors: the first is a source connector that reads lines from an input file and produces each one to a Kafka topic,
and the second is a sink connector that reads messages from a Kafka topic and produces each one as a line in an output file.
</p>
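
<p>
For reference, the connector definitions in these sample files look roughly like the following (abridged; check the copies
shipped with your Kafka release, since exact contents can vary between versions):
</p>

<pre class="brush: text;">
# config/connect-file-source.properties
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test

# config/connect-file-sink.properties
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test</pre>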

<p>
During startup you'll see a number of log messages, including some indicating that the connectors are being instantiated.
Once the Kafka Connect process has started, the source connector should start reading lines from <code>test.txt</code> and
producing them to the topic <code>connect-test</code>, and the sink connector should start reading messages from the topic <code>connect-test</code>
and writing them to the file <code>test.sink.txt</code>. We can verify the data has been delivered through the entire pipeline
by examining the contents of the output file:
</p>

<pre class="brush: bash;">
> more test.sink.txt
foo
bar</pre>
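
<p>
You can also confirm that both connectors were created by querying Kafka Connect's REST interface, which standalone mode
serves on port 8083 by default (the names shown assume the sample configuration files above):
</p>

<pre class="brush: bash;">
> curl -s localhost:8083/connectors
["local-file-source","local-file-sink"]</pre>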

<p>
Note that the data is being stored in the Kafka topic <code>connect-test</code>, so we can also run a console consumer to see the
data in the topic (or use custom consumer code to process it):
</p>

<pre class="brush: bash;">
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
...</pre>
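
<p>
Each record value above is a JSON envelope with a <code>schema</code> and a <code>payload</code>, which is how the JSON
converter configured in the sample <code>connect-standalone.properties</code> serializes the file's lines. If you'd rather
process the records programmatically than watch the console consumer, a minimal consumer might look like the following
sketch (the class name and group id here are illustrative, not part of Kafka):
</p>

<pre class="brush: java;">
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConnectTestConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "connect-test-reader"); // illustrative group id
        props.put("auto.offset.reset", "earliest");   // start from the beginning of the topic
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer&lt;String, String&gt; consumer = new KafkaConsumer&lt;&gt;(props)) {
            consumer.subscribe(List.of("connect-test"));
            while (true) {
                ConsumerRecords&lt;String, String&gt; records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord&lt;String, String&gt; record : records) {
                    // Each value is the JSON envelope shown above, e.g. {"schema":...,"payload":"foo"}
                    System.out.println(record.value());
                }
            }
        }
    }
}</pre>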

<p>The connectors continue to process data, so we can add data to the file and see it move through the pipeline:</p>

<pre class="brush: bash;">
> echo Another line>> test.txt</pre>

<p>You should see the line appear in the console consumer output and in the sink file.</p>
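
<p>
For example, viewing the sink file again should now show the appended line:
</p>

<pre class="brush: bash;">
> more test.sink.txt
foo
bar
Another line</pre>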
</div>