KAFKA-13455: Add steps to run Kafka Connect to quickstart (#11500)
Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
Reviewers: Mickael Maison <mickael.maison@gmail.com>
@@ -161,12 +161,77 @@ This is my second event</code></pre>
<p>
You probably have lots of data in existing systems like relational databases or traditional messaging systems,
along with many applications that already use these systems.
<a href="/documentation/#connect">Kafka Connect</a> allows you to continuously ingest
data from external systems into Kafka, and vice versa. It is an extensible tool that runs
<i>connectors</i>, which implement the custom logic for interacting with an external system.
It is thus very easy to integrate existing systems with Kafka. To make this process even easier,
there are hundreds of such connectors readily available.
</p>
<p>Take a look at the <a href="/documentation/#connect">Kafka Connect section</a>
to learn more about how to continuously import/export your data into and out of Kafka.</p>

<p>
In this quickstart we'll see how to run Kafka Connect with simple connectors that import data
from a file to a Kafka topic and export data from a Kafka topic to a file.
</p>

<p>
First, we'll start by creating some seed data to test with:
</p>
<pre class="brush: bash;">
> echo -e "foo\nbar" > test.txt</pre>
Or on Windows:
<pre class="brush: bash;">
> echo foo> test.txt
> echo bar>> test.txt</pre>
<p>
Next, we'll start two connectors running in <i>standalone</i> mode, which means they run in a single, local, dedicated
process. We provide three configuration files as parameters. The first is always the configuration for the Kafka Connect
process, containing common configuration such as the Kafka brokers to connect to and the serialization format for data.
The remaining configuration files each specify a connector to create. These files include a unique connector name, the connector
class to instantiate, and any other configuration required by the connector.
</p>
<pre class="brush: bash;">
> bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties</pre>
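<p>
For reference, the worker configuration in <code>config/connect-standalone.properties</code> contains settings
along these lines (abridged here; the exact file shipped with your Kafka version may differ slightly):
</p>

<pre class="brush: text;">
# Kafka brokers to connect to
bootstrap.servers=localhost:9092

# Serialization format: the JSON converter with schemas enabled wraps each
# record in an envelope with "schema" and "payload" fields
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true

# Where standalone mode stores source connector offsets
offset.storage.file.filename=/tmp/connect.offsets</pre>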
<p>
These sample configuration files, included with Kafka, use the default local cluster configuration you started earlier
and create two connectors: the first is a source connector that reads lines from an input file and produces each line
to a Kafka topic, and the second is a sink connector that reads messages from a Kafka topic and writes each one as a
line in an output file.
</p>
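<p>
The two connector files themselves are short. Their key settings look roughly like the following
(taken from the samples shipped with Kafka, which may vary slightly between versions):
</p>

<pre class="brush: text;">
# config/connect-file-source.properties: read lines from test.txt
# and produce each line to the connect-test topic
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test

# config/connect-file-sink.properties: consume messages from the
# connect-test topic and append each one as a line to test.sink.txt
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test</pre>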
<p>
During startup you'll see a number of log messages, including some indicating that the connectors are being instantiated.
Once the Kafka Connect process has started, the source connector should start reading lines from <code>test.txt</code> and
producing them to the topic <code>connect-test</code>, and the sink connector should start reading messages from the topic <code>connect-test</code>
and writing them to the file <code>test.sink.txt</code>. We can verify the data has been delivered through the entire pipeline
by examining the contents of the output file:
</p>
<pre class="brush: bash;">
> more test.sink.txt
foo
bar</pre>
<p>
Note that the data is being stored in the Kafka topic <code>connect-test</code>, so we can also run a console consumer to see the
data in the topic (or use custom consumer code to process it). Because the worker configuration enables schemas for the
JSON converter, each record appears wrapped in an envelope with <code>schema</code> and <code>payload</code> fields:
</p>
<pre class="brush: bash;">
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
...</pre>
<p>The connectors continue to process data, so we can add data to the file and see it move through the pipeline:</p>
<pre class="brush: bash;">
> echo Another line>> test.txt</pre>
<p>You should see the line appear in the console consumer output and in the sink file.</p>
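<p>
For example, assuming a Unix-like shell where <code>tail</code> is available, you can watch new lines arrive in the
sink file as the connector writes them:
</p>

<pre class="brush: bash;">
> tail -f test.sink.txt</pre>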
</div>