<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<script>
<!--#include virtual="js/templateData.js" -->
</script>

<script id="quickstart-template" type="text/x-handlebars-template">

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_download" href="#quickstart_download"></a>
<a href="#quickstart_download">Step 1: Get Kafka</a>
</h4>

<p>
<a href="https://www.apache.org/dyn/closer.cgi?path=/kafka/{{fullDotVersion}}/kafka_{{scalaVersion}}-{{fullDotVersion}}.tgz">Download</a>
the latest Kafka release and extract it:
</p>

<pre><code class="language-bash">$ tar -xzf kafka_{{scalaVersion}}-{{fullDotVersion}}.tgz
$ cd kafka_{{scalaVersion}}-{{fullDotVersion}}</code></pre>
</div>

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_startserver" href="#quickstart_startserver"></a>
<a href="#quickstart_startserver">Step 2: Start the Kafka environment</a>
</h4>

<p class="note">NOTE: Your local environment must have Java 17+ installed.</p>

<p>Kafka can be run either from the downloaded release files using the local scripts, or from a Docker image.</p>

<h5>Using downloaded files</h5>

<p>Generate a cluster UUID:</p>
<pre><code class="language-bash">$ KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"</code></pre>

<p>Format the log directories:</p>
<pre><code class="language-bash">$ bin/kafka-storage.sh format --standalone -t $KAFKA_CLUSTER_ID -c config/server.properties</code></pre>

<p>Start the Kafka server:</p>
<pre><code class="language-bash">$ bin/kafka-server-start.sh config/server.properties</code></pre>

<p>Once the Kafka server has successfully launched, you will have a basic Kafka environment running and ready to use.</p>
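<p>Optionally, you can sanity-check that the broker is accepting connections before moving on. For example, listing topics should return without error (the list will be empty on a fresh cluster); this check is a suggestion and not part of the official quickstart steps:</p>

<pre><code class="language-bash"># Quick liveness check; assumes the broker started above is on localhost:9092
$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092</code></pre>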

<h5>Using the JVM-based Apache Kafka Docker image</h5>

<p>Get the Docker image:</p>
<pre><code class="language-bash">$ docker pull apache/kafka:{{fullDotVersion}}</code></pre>

<p>Start the Kafka Docker container:</p>
<pre><code class="language-bash">$ docker run -p 9092:9092 apache/kafka:{{fullDotVersion}}</code></pre>

<h5>Using the GraalVM-based native Apache Kafka Docker image</h5>

<p>Get the Docker image:</p>
<pre><code class="language-bash">$ docker pull apache/kafka-native:{{fullDotVersion}}</code></pre>

<p>Start the Kafka Docker container:</p>
<pre><code class="language-bash">$ docker run -p 9092:9092 apache/kafka-native:{{fullDotVersion}}</code></pre>
</div>

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_createtopic" href="#quickstart_createtopic"></a>
<a href="#quickstart_createtopic">Step 3: Create a topic to store your events</a>
</h4>

<p>
Kafka is a distributed <em>event streaming platform</em> that lets you read, write, store, and process
<a href="/documentation/#messages"><em>events</em></a> (also called <em>records</em> or <em>messages</em> in the documentation)
across many machines.
</p>

<p>
Example events are payment transactions, geolocation updates from mobile phones, shipping orders, sensor measurements
from IoT devices or medical equipment, and much more. These events are organized and stored in
<a href="/documentation/#intro_concepts_and_terms"><em>topics</em></a>.
In simplified terms, a topic is similar to a folder in a filesystem, and the events are the files in that folder.
</p>

<p>
Before you can write your first events, you must create a topic. Open another terminal session and run:
</p>

<pre><code class="language-bash">$ bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092</code></pre>

<p>
All of Kafka's command line tools have additional options: run the <code>kafka-topics.sh</code> command without any
arguments to display usage information. For example, it can also show you
<a href="/documentation/#intro_concepts_and_terms">details such as the partition count</a>
of the new topic:
</p>

<pre><code class="language-bash">$ bin/kafka-topics.sh --describe --topic quickstart-events --bootstrap-server localhost:9092
Topic: quickstart-events TopicId: NPmZHyhbR9y00wMglMH2sg PartitionCount: 1 ReplicationFactor: 1 Configs:
Topic: quickstart-events Partition: 0 Leader: 0 Replicas: 0 Isr: 0</code></pre>
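<p>If you later need more parallelism, the partition count of an existing topic can be increased with the same tool. For example, to grow <code>quickstart-events</code> to three partitions (note that a topic's partition count can only be increased, never decreased):</p>

<pre><code class="language-bash"># Increase the partition count of an existing topic
$ bin/kafka-topics.sh --alter --topic quickstart-events --partitions 3 --bootstrap-server localhost:9092</code></pre>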
</div>

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_send" href="#quickstart_send"></a>
<a href="#quickstart_send">Step 4: Write some events into the topic</a>
</h4>

<p>
A Kafka client communicates with the Kafka brokers over the network to write (or read) events.
Once received, the brokers will store the events in a durable and fault-tolerant manner for as long as you
need—even forever.
</p>

<p>
Run the console producer client to write a few events into your topic.
By default, each line you enter will result in a separate event being written to the topic.
</p>

<pre><code class="language-bash">$ bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
>This is my first event
>This is my second event</code></pre>

<p>You can stop the producer client with <code>Ctrl-C</code> at any time.</p>
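<p>Instead of typing events interactively, you can also pipe a file into the producer, one event per line. Here <code>events.txt</code> is a hypothetical file you would create yourself:</p>

<pre><code class="language-bash"># Produce every line of events.txt as a separate event
$ bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092 &lt; events.txt</code></pre>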
</div>

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_consume" href="#quickstart_consume"></a>
<a href="#quickstart_consume">Step 5: Read the events</a>
</h4>

<p>Open another terminal session and run the console consumer client to read the events you just created:</p>

<pre><code class="language-bash">$ bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
This is my first event
This is my second event</code></pre>

<p>You can stop the consumer client with <code>Ctrl-C</code> at any time.</p>

<p>Feel free to experiment: for example, switch back to your producer terminal (previous step) to write
additional events, and see how the events immediately show up in your consumer terminal.</p>

<p>Because events are durably stored in Kafka, they can be read as many times and by as many consumers as you want.
You can easily verify this by opening yet another terminal session and re-running the previous command.</p>
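<p>One way to explore this further is to read the topic as part of a named <em>consumer group</em>. Kafka then tracks the group's read position, so a restarted consumer resumes where it left off instead of re-reading everything (<code>my-group</code> below is just an example name):</p>

<pre><code class="language-bash"># Consume as part of a consumer group; Kafka remembers this group's offset
$ bin/kafka-console-consumer.sh --topic quickstart-events --group my-group --bootstrap-server localhost:9092</code></pre>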
</div>

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_kafkaconnect" href="#quickstart_kafkaconnect"></a>
<a href="#quickstart_kafkaconnect">Step 6: Import/export your data as streams of events with Kafka Connect</a>
</h4>

<p>
You probably have lots of data in existing systems like relational databases or traditional messaging systems,
along with many applications that already use these systems.
<a href="/documentation/#connect">Kafka Connect</a> allows you to continuously ingest
data from external systems into Kafka, and vice versa. It is an extensible tool that runs
<i>connectors</i>, which implement the custom logic for interacting with an external system.
This makes it very easy to integrate existing systems with Kafka. To make this process even easier,
there are hundreds of such connectors readily available.
</p>

<p>
In this quickstart we'll see how to run Kafka Connect with simple connectors that import data
from a file to a Kafka topic and export data from a Kafka topic to a file.
</p>

<p>
First, make sure to add <code class="language-bash">connect-file-{{fullDotVersion}}.jar</code> to the <code>plugin.path</code> property in the Connect worker's configuration.
For the purpose of this quickstart we'll use a relative path and treat the connectors' package as an uber jar, which works when the quickstart commands are run from the installation directory.
However, for production deployments absolute paths are always preferable. See <a href="/documentation/#connectconfigs_plugin.path">plugin.path</a> for a detailed description of how to set this config.
</p>

<p>
Edit the <code class="language-bash">config/connect-standalone.properties</code> file, add or change the <code>plugin.path</code> configuration property to match the following, and save the file:
</p>

<pre><code class="language-bash">$ echo "plugin.path=libs/connect-file-{{fullDotVersion}}.jar" >> config/connect-standalone.properties</code></pre>

<p>
Then, create some seed data to test with:
</p>

<pre><code class="language-bash">$ echo -e "foo\nbar" > test.txt</code></pre>

<p>Or on Windows:</p>
<pre><code class="language-bash">$ echo foo > test.txt
$ echo bar >> test.txt</code></pre>

<p>
Next, we'll start two connectors running in <i>standalone</i> mode, which means they run in a single, local, dedicated
process. We provide three configuration files as parameters. The first is always the configuration for the Kafka Connect
process, containing common configuration such as the Kafka brokers to connect to and the serialization format for data.
The remaining configuration files each specify a connector to create. These files include a unique connector name, the connector
class to instantiate, and any other configuration required by the connector.
</p>

<pre><code class="language-bash">$ bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties</code></pre>

<p>
These sample configuration files, included with Kafka, use the default local cluster configuration you started earlier
and create two connectors: the first is a source connector that reads lines from an input file and produces each to a Kafka topic,
and the second is a sink connector that reads messages from a Kafka topic and produces each as a line in an output file.
</p>

<p>
During startup you'll see a number of log messages, including some indicating that the connectors are being instantiated.
Once the Kafka Connect process has started, the source connector should start reading lines from <code>test.txt</code> and
producing them to the topic <code>connect-test</code>, and the sink connector should start reading messages from the topic <code>connect-test</code>
and writing them to the file <code>test.sink.txt</code>. We can verify the data has been delivered through the entire pipeline
by examining the contents of the output file:
</p>

<pre><code class="language-bash">$ more test.sink.txt
foo
bar</code></pre>

<p>
Note that the data is being stored in the Kafka topic <code>connect-test</code>, so we can also run a console consumer to see the
data in the topic (or use custom consumer code to process it):
</p>

<pre><code class="language-bash">$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
…</code></pre>

<p>The connectors continue to process data, so we can add data to the file and see it move through the pipeline:</p>

<pre><code class="language-bash">$ echo "Another line" >> test.txt</code></pre>

<p>You should see the line appear in the console consumer output and in the sink file.</p>
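<p>Kafka Connect also exposes a REST API (on port 8083 by default) that you can use to inspect the running connectors. For example, assuming the connector names defined in the sample configuration files:</p>

<pre><code class="language-bash"># List the connectors registered with this Connect worker
$ curl http://localhost:8083/connectors</code></pre>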
</div>

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_kafkastreams" href="#quickstart_kafkastreams"></a>
<a href="#quickstart_kafkastreams">Step 7: Process your events with Kafka Streams</a>
</h4>

<p>
Once your data is stored in Kafka as events, you can process the data with the
<a href="/documentation/streams">Kafka Streams</a> client library for Java/Scala.
It allows you to implement mission-critical real-time applications and microservices, where the input
and/or output data is stored in Kafka topics. Kafka Streams combines the simplicity of writing and deploying
standard Java and Scala applications on the client side with the benefits of Kafka's server-side cluster
technology to make these applications highly scalable, elastic, fault-tolerant, and distributed. The library
supports exactly-once processing, stateful operations and aggregations, windowing, joins, processing based
on event-time, and much more.
</p>

<p>To give you a first taste, here's how one would implement the popular <code>WordCount</code> algorithm:</p>

<pre class="line-numbers"><code class="language-java">// Assumes an existing StreamsBuilder named "builder"
KStream&lt;String, String&gt; textLines = builder.stream("quickstart-events");

KTable&lt;String, Long&gt; wordCounts = textLines
    .flatMapValues(line -> Arrays.asList(line.toLowerCase().split(" ")))
    .groupBy((keyIgnored, word) -> word)
    .count();

wordCounts.toStream().to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));</code></pre>

<p>
The <a href="/documentation/streams/quickstart">Kafka Streams demo</a>
and the <a href="/{{version}}/documentation/streams/tutorial">app development tutorial</a>
demonstrate how to code and run such a streaming application from start to finish.
</p>
</div>

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_kafkaterminate" href="#quickstart_kafkaterminate"></a>
<a href="#quickstart_kafkaterminate">Step 8: Terminate the Kafka environment</a>
</h4>

<p>
Now that you have reached the end of the quickstart, feel free to tear down the Kafka environment—or
continue playing around.
</p>

<ol>
<li>
Stop the producer and consumer clients with <code>Ctrl-C</code>, if you haven't done so already.
</li>
<li>
Stop the Kafka broker with <code>Ctrl-C</code>.
</li>
</ol>

<p>
If you also want to delete any data of your local Kafka environment, including any events you have created
along the way, run:
</p>

<pre><code class="language-bash">$ rm -rf /tmp/kafka-logs /tmp/kraft-combined-logs</code></pre>
</div>

<div class="quickstart-step">
<h4 class="anchor-heading">
<a class="anchor-link" id="quickstart_kafkacongrats" href="#quickstart_kafkacongrats"></a>
<a href="#quickstart_kafkacongrats">Congratulations!</a>
</h4>

<p>You have successfully finished the Apache Kafka quickstart.</p>

<p>To learn more, we suggest the following next steps:</p>

<ul>
<li>
Read through the brief <a href="/intro">Introduction</a>
to learn how Kafka works at a high level, its main concepts, and how it compares to other
technologies. To understand Kafka in more detail, head over to the
<a href="/documentation/">Documentation</a>.
</li>
<li>
Browse through the <a href="/powered-by">Use Cases</a> to learn how
other users in our world-wide community are getting value out of Kafka.
</li>
<li>
Join a <a href="/events">local Kafka meetup group</a> and
<a href="https://kafka-summit.org/past-events/">watch talks from Kafka Summit</a>,
the main conference of the Kafka community.
</li>
</ul>
</div>
</script>

<div class="p-quickstart"></div>