<!-- |
|
Licensed to the Apache Software Foundation (ASF) under one or more |
|
contributor license agreements. See the NOTICE file distributed with |
|
this work for additional information regarding copyright ownership. |
|
The ASF licenses this file to You under the Apache License, Version 2.0 |
|
(the "License"); you may not use this file except in compliance with |
|
the License. You may obtain a copy of the License at |
|
|
|
http://www.apache.org/licenses/LICENSE-2.0 |
|
|
|
Unless required by applicable law or agreed to in writing, software |
|
distributed under the License is distributed on an "AS IS" BASIS, |
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
|
See the License for the specific language governing permissions and |
|
limitations under the License. |
|
--> |
|
|
|
<script> |
|
<!--#include virtual="js/templateData.js" --> |
|
</script> |
|
|
|
<script id="quickstart-template" type="text/x-handlebars-template"> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_download" href="#quickstart_download"></a> |
|
<a href="#quickstart_download">Step 1: Get Kafka</a> |
|
</h4> |
|
|
|
<p> |
|
<a href="https://www.apache.org/dyn/closer.cgi?path=/kafka/{{fullDotVersion}}/kafka_{{scalaVersion}}-{{fullDotVersion}}.tgz">Download</a> |
|
the latest Kafka release and extract it: |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ tar -xzf kafka_{{scalaVersion}}-{{fullDotVersion}}.tgz
$ cd kafka_{{scalaVersion}}-{{fullDotVersion}}</code></pre>
|
</div> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_startserver" href="#quickstart_startserver"></a> |
|
<a href="#quickstart_startserver">Step 2: Start the Kafka environment</a> |
|
</h4> |
|
|
|
<p class="note"> |
|
NOTE: Your local environment must have Java 8+ installed. |
|
</p> |
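<p>
As a rough sketch of how this requirement could be checked before starting any service, the snippet below parses the version string printed by <code>java -version</code>. The helper name <code>java_major</code> is hypothetical and not part of Kafka, and the parsing assumes the usual <code>1.8.0_x</code> or <code>17.0.x</code> version formats.
</p>

```shell
# Hypothetical helper (not shipped with Kafka): print the major version
# for a Java version string such as "1.8.0_292" or "17.0.2".
java_major() {
  v="$1"
  # Releases up to Java 8 report "1.MAJOR", so drop the leading "1.".
  case "$v" in
    1.*) v="${v#1.}" ;;
  esac
  # Keep only the leading run of digits.
  printf '%s\n' "${v%%[!0-9]*}"
}

if command -v java >/dev/null 2>&1; then
  # "java -version" writes to stderr; the version is the first quoted field.
  raw="$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')"
  major="$(java_major "$raw")"
  if [ "${major:-0}" -ge 8 ]; then
    echo "Java $major found"
  else
    echo "Java 8+ is required, found: ${raw:-unknown}"
  fi
else
  echo "java not found on PATH"
fi
```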
|
|
|
<p> |
|
Apache Kafka can be started using ZooKeeper or KRaft. To get started with either configuration, follow one of the sections below, but not both.
|
</p> |
|
|
|
<h5> |
|
Kafka with ZooKeeper |
|
</h5> |
|
|
|
<p> |
|
Run the following commands in order to start all services in the correct order: |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash"># Start the ZooKeeper service
$ bin/zookeeper-server-start.sh config/zookeeper.properties</code></pre>
|
|
|
<p> |
|
Open another terminal session and run: |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash"># Start the Kafka broker service
$ bin/kafka-server-start.sh config/server.properties</code></pre>
|
|
|
<p> |
|
Once all services have successfully launched, you will have a basic Kafka environment running and ready to use. |
|
</p> |
|
|
|
<h5> |
|
Kafka with KRaft |
|
</h5> |
|
|
|
<p> |
|
Generate a Cluster UUID |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"</code></pre> |
|
|
|
<p> |
|
Format Log Directories |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server.properties</code></pre> |
|
|
|
<p> |
|
Start the Kafka Server |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ bin/kafka-server-start.sh config/kraft/server.properties</code></pre> |
|
|
|
<p> |
|
Once the Kafka server has successfully launched, you will have a basic Kafka environment running and ready to use. |
|
</p> |
|
</div> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_createtopic" href="#quickstart_createtopic"></a> |
|
<a href="#quickstart_createtopic">Step 3: Create a topic to store your events</a> |
|
</h4> |
|
|
|
<p> |
|
Kafka is a distributed <em>event streaming platform</em> that lets you read, write, store, and process |
|
<a href="/documentation/#messages"><em>events</em></a> (also called <em>records</em> or |
|
<em>messages</em> in the documentation) |
|
across many machines. |
|
</p> |
|
|
|
<p> |
|
Example events are payment transactions, geolocation updates from mobile phones, shipping orders, sensor measurements |
|
from IoT devices or medical equipment, and much more. These events are organized and stored in |
|
<a href="/documentation/#intro_concepts_and_terms"><em>topics</em></a>. |
|
Very simplified, a topic is similar to a folder in a filesystem, and the events are the files in that folder. |
|
</p> |
|
|
|
<p> |
|
So before you can write your first events, you must create a topic. Open another terminal session and run: |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092</code></pre> |
|
|
|
<p> |
|
All of Kafka's command line tools have additional options: run the <code>kafka-topics.sh</code> command without any |
|
arguments to display usage information. For example, it can also show you |
|
<a href="/documentation/#intro_concepts_and_terms">details such as the partition count</a> |
|
of the new topic: |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ bin/kafka-topics.sh --describe --topic quickstart-events --bootstrap-server localhost:9092
Topic: quickstart-events TopicId: NPmZHyhbR9y00wMglMH2sg PartitionCount: 1 ReplicationFactor: 1 Configs:
Topic: quickstart-events Partition: 0 Leader: 0 Replicas: 0 Isr: 0</code></pre>
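<p>
If a script needs one of these values, the describe output can be post-processed with standard tools. The following is a minimal sketch, assuming the summary line keeps the whitespace-separated <code>PartitionCount: N</code> form shown above. The helper name <code>partition_count</code> is made up for this example, and the sample text stands in for real command output.
</p>

```shell
# Hypothetical helper: read "kafka-topics.sh --describe" output on stdin
# and print the number that follows the "PartitionCount:" label.
partition_count() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "PartitionCount:") { print $(i + 1); exit }
    }
  }'
}

# Sample summary line copied from the output above; against a live broker,
# pipe the real command into the helper instead.
sample='Topic: quickstart-events TopicId: NPmZHyhbR9y00wMglMH2sg PartitionCount: 1 ReplicationFactor: 1 Configs:'
count="$(printf '%s\n' "$sample" | partition_count)"
echo "partitions: $count"
```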
|
</div> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_send" href="#quickstart_send"></a> |
|
<a href="#quickstart_send">Step 4: Write some events into the topic</a> |
|
</h4> |
|
|
|
<p> |
|
A Kafka client communicates with the Kafka brokers via the network for writing (or reading) events. |
|
Once received, the brokers will store the events in a durable and fault-tolerant manner for as long as you |
|
need—even forever. |
|
</p> |
|
|
|
<p> |
|
Run the console producer client to write a few events into your topic. |
|
By default, each line you enter will result in a separate event being written to the topic. |
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
This is my first event
This is my second event</code></pre>
|
|
|
<p> |
|
You can stop the producer client with <code>Ctrl-C</code> at any time. |
|
</p> |
|
</div> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_consume" href="#quickstart_consume"></a> |
|
<a href="#quickstart_consume">Step 5: Read the events</a> |
|
</h4> |
|
|
|
<p>Open another terminal session and run the console consumer client to read the events you just created:</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
This is my first event
This is my second event</code></pre>
|
|
|
<p>You can stop the consumer client with <code>Ctrl-C</code> at any time.</p> |
|
|
|
<p>Feel free to experiment: for example, switch back to your producer terminal (previous step) to write |
|
additional events, and see how the events immediately show up in your consumer terminal.</p> |
|
|
|
<p>Because events are durably stored in Kafka, they can be read as many times and by as many consumers as you want. |
|
You can easily verify this by opening yet another terminal session and running the previous command again.</p>
|
</div> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_kafkaconnect" href="#quickstart_kafkaconnect"></a> |
|
<a href="#quickstart_kafkaconnect">Step 6: Import/export your data as streams of events with Kafka Connect</a> |
|
</h4> |
|
|
|
<p> |
|
You probably have lots of data in existing systems like relational databases or traditional messaging systems, |
|
along with many applications that already use these systems. |
|
<a href="/documentation/#connect">Kafka Connect</a> allows you to continuously ingest |
|
data from external systems into Kafka, and vice versa. It is an extensible tool that runs |
|
<i>connectors</i>, which implement the custom logic for interacting with an external system. |
|
It is thus very easy to integrate existing systems with Kafka. To make this process even easier, |
|
there are hundreds of such connectors readily available. |
|
</p> |
|
|
|
<p> |
|
In this quickstart we'll see how to run Kafka Connect with simple connectors that import data |
|
from a file to a Kafka topic and export data from a Kafka topic to a file. |
|
</p> |
|
|
|
<p> |
|
First, make sure to add <code class="language-bash">connect-file-{{fullDotVersion}}.jar</code> to the <code>plugin.path</code> property in the Connect worker's configuration. |
|
For the purpose of this quickstart we'll use a relative path and consider the connectors' package as an uber jar, which works when the quickstart commands are run from the installation directory. |
|
However, it's worth noting that for production deployments using absolute paths is always preferable. See <a href="/documentation/#connectconfigs_plugin.path">plugin.path</a> for a detailed description of how to set this config. |
|
</p> |
|
|
|
<p> |
|
Edit the <code class="language-bash">config/connect-standalone.properties</code> file, add or change the <code>plugin.path</code> configuration property so that it matches the following, and save the file:
|
</p> |
|
|
|
<pre class="brush: bash;">
> echo "plugin.path=libs/connect-file-{{fullDotVersion}}.jar" >> config/connect-standalone.properties</pre>
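<p>
If you would rather script this change, one possible approach is sketched below. It is an illustration under stated assumptions, not a Kafka tool: <code>set_property</code> is a hypothetical helper that drops any existing uncommented assignment of a key and then appends the new value, so it can be run repeatedly without creating duplicate entries. The demo operates on a temporary file rather than your real configuration.
</p>

```shell
# Hypothetical helper: set KEY=VALUE in a Java properties file,
# replacing any existing uncommented assignment of the same key.
set_property() {
  file="$1"; key="$2"; value="$3"
  tmp="$(mktemp)"
  # Keep every line that does not start with "KEY=" (commented lines stay).
  awk -v k="$key=" 'index($0, k) != 1' "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Demo on a scratch file that mimics a worker configuration.
demo="$(mktemp)"
printf '%s\n' 'bootstrap.servers=localhost:9092' '#plugin.path=' > "$demo"

set_property "$demo" plugin.path "libs/connect-file-{{fullDotVersion}}.jar"
# Running it a second time does not add a duplicate line.
set_property "$demo" plugin.path "libs/connect-file-{{fullDotVersion}}.jar"
cat "$demo"
```

<p>
For the quickstart itself, the first argument would be <code>config/connect-standalone.properties</code>.
</p>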
|
|
|
<p> |
|
Then, start by creating some seed data to test with: |
|
</p> |
|
|
|
<pre class="brush: bash;">
> echo -e "foo\nbar" > test.txt</pre>
<p>Or on Windows:</p>
<pre class="brush: bash;">
> echo foo> test.txt
> echo bar>> test.txt</pre>
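<p>
Note that <code>echo -e</code> is not interpreted the same way by every shell; some print the <code>-e</code> literally. On POSIX systems, a portable alternative, offered here as a small sketch, is <code>printf</code>:
</p>

```shell
# Write one line per argument; printf behaves consistently across POSIX shells.
printf '%s\n' foo bar > test.txt

# Show the seed file.
cat test.txt
```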
|
|
|
<p> |
|
Next, we'll start two connectors running in <i>standalone</i> mode, which means they run in a single, local, dedicated |
|
process. We provide three configuration files as parameters. The first is always the configuration for the Kafka Connect |
|
process, containing common configuration such as the Kafka brokers to connect to and the serialization format for data. |
|
The remaining configuration files each specify a connector to create. These files include a unique connector name, the connector |
|
class to instantiate, and any other configuration required by the connector. |
|
</p> |
|
|
|
<pre class="brush: bash;">
> bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties</pre>
|
|
|
<p> |
|
These sample configuration files, included with Kafka, use the default local cluster configuration you started earlier |
|
and create two connectors: the first is a source connector that reads lines from an input file and produces each to a Kafka topic |
|
and the second is a sink connector that reads messages from a Kafka topic and produces each as a line in an output file. |
|
</p> |
|
|
|
<p> |
|
During startup you'll see a number of log messages, including some indicating that the connectors are being instantiated. |
|
Once the Kafka Connect process has started, the source connector should start reading lines from <code>test.txt</code> and |
|
producing them to the topic <code>connect-test</code>, and the sink connector should start reading messages from the topic <code>connect-test</code> |
|
and writing them to the file <code>test.sink.txt</code>. We can verify the data has been delivered through the entire pipeline
|
by examining the contents of the output file: |
|
</p> |
|
|
|
|
|
<pre class="brush: bash;">
> more test.sink.txt
foo
bar</pre>
|
|
|
<p> |
|
Note that the data is being stored in the Kafka topic <code>connect-test</code>, so we can also run a console consumer to see the |
|
data in the topic (or use custom consumer code to process it): |
|
</p> |
|
|
|
|
|
<pre class="brush: bash;">
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
...</pre>
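<p>
To look at only the original lines, the <code>payload</code> field can be pulled out of each JSON record. The following is a minimal sketch using <code>sed</code>, assuming simple string payloads with no embedded double quotes; a real JSON parser would be more robust. The helper name <code>extract_payload</code> is hypothetical, and the sample records are copied from the output above.
</p>

```shell
# Hypothetical helper: print the value of the "payload" field from each
# JSON record on stdin. Assumes a plain string payload without escaped quotes.
extract_payload() {
  sed -n 's/.*"payload":"\([^"]*\)".*/\1/p'
}

# Sample records copied from the consumer output above.
records='{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}'

payloads="$(printf '%s\n' "$records" | extract_payload)"
printf '%s\n' "$payloads"
```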
|
|
|
<p>The connectors continue to process data, so we can add data to the file and see it move through the pipeline:</p> |
|
|
|
<pre class="brush: bash;">
> echo Another line>> test.txt</pre>
|
|
|
<p>You should see the line appear in the console consumer output and in the sink file.</p> |
|
|
|
</div> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_kafkastreams" href="#quickstart_kafkastreams"></a> |
|
<a href="#quickstart_kafkastreams">Step 7: Process your events with Kafka Streams</a> |
|
</h4> |
|
|
|
<p> |
|
Once your data is stored in Kafka as events, you can process the data with the |
|
<a href="/documentation/streams">Kafka Streams</a> client library for Java/Scala. |
|
It allows you to implement mission-critical real-time applications and microservices, where the input |
|
and/or output data is stored in Kafka topics. Kafka Streams combines the simplicity of writing and deploying |
|
standard Java and Scala applications on the client side with the benefits of Kafka's server-side cluster |
|
technology to make these applications highly scalable, elastic, fault-tolerant, and distributed. The library |
|
supports exactly-once processing, stateful operations and aggregations, windowing, joins, processing based |
|
on event-time, and much more. |
|
</p> |
|
|
|
<p>To give you a first taste, here's how one would implement the popular <code>WordCount</code> algorithm:</p> |
|
|
|
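<pre class="line-numbers"><code class="language-java">// Assumes builder is an org.apache.kafka.streams.StreamsBuilder instance.
// Read the events in the topic as a stream of text lines.
KStream&lt;String, String&gt; textLines = builder.stream("quickstart-events");

// Split each line into lowercase words, group by word, and count occurrences.
KTable&lt;String, Long&gt; wordCounts = textLines
            .flatMapValues(line -&gt; Arrays.asList(line.toLowerCase().split(" ")))
            .groupBy((keyIgnored, word) -&gt; word)
            .count();

// Write the continuously updated counts to an output topic.
wordCounts.toStream().to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));</code></pre>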
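<p>
To make the expected result concrete, the same counting logic can be mimicked on a fixed input with ordinary shell tools. This is only a batch analogy, not Kafka Streams: a real Streams application keeps updating the counts as new events arrive. The sketch lowercases each line, splits it on spaces, and counts each word, using the two events written in Step 4 as input.
</p>

```shell
# Sample input: the two events written to quickstart-events in Step 4.
events='This is my first event
This is my second event'

# Lowercase, put one word per line, then count identical words.
# LC_ALL=C keeps the sort order stable across locales.
counts="$(printf '%s\n' "$events" \
  | tr '[:upper:]' '[:lower:]' \
  | tr ' ' '\n' \
  | LC_ALL=C sort \
  | uniq -c \
  | awk '{ print $2, $1 }')"

# One "word count" pair per line.
printf '%s\n' "$counts"
```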
|
|
|
<p> |
|
The <a href="/documentation/streams/quickstart">Kafka Streams demo</a> |
|
and the <a href="/{{version}}/documentation/streams/tutorial">app development tutorial</a> |
|
demonstrate how to code and run such a streaming application from start to finish. |
|
</p> |
|
|
|
</div> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_kafkaterminate" href="#quickstart_kafkaterminate"></a> |
|
<a href="#quickstart_kafkaterminate">Step 8: Terminate the Kafka environment</a> |
|
</h4> |
|
|
|
<p> |
|
Now that you have reached the end of the quickstart, feel free to tear down the Kafka environment, or
continue playing around.
|
</p> |
|
|
|
<ol> |
|
<li> |
|
Stop the producer and consumer clients with <code>Ctrl-C</code>, if you haven't done so already. |
|
</li> |
|
<li> |
|
Stop the Kafka broker with <code>Ctrl-C</code>. |
|
</li> |
|
<li> |
|
Lastly, if the Kafka with ZooKeeper section was followed, stop the ZooKeeper server with <code>Ctrl-C</code>. |
|
</li> |
|
</ol> |
|
|
|
<p> |
|
If you also want to delete all data in your local Kafka environment, including any events you created
along the way, run the command:
|
</p> |
|
|
|
<pre class="line-numbers"><code class="language-bash">$ rm -rf /tmp/kafka-logs /tmp/zookeeper /tmp/kraft-combined-logs</code></pre> |
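<p>
Because <code>rm -rf</code> is unforgiving, a slightly more cautious variant is sketched below. The helper <code>remove_dirs</code> is hypothetical: it deletes only the directories that actually exist and reports what it did. The demo runs against a scratch directory so that nothing real is deleted.
</p>

```shell
# Hypothetical helper: delete each given directory if it exists,
# and report what was done.
remove_dirs() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      rm -rf -- "$dir"
      echo "removed $dir"
    else
      echo "skipped $dir (not found)"
    fi
  done
}

# Demo on a scratch directory: one path exists, the other does not.
scratch="$(mktemp -d)"
mkdir "$scratch/kafka-logs"
result="$(remove_dirs "$scratch/kafka-logs" "$scratch/zookeeper")"
printf '%s\n' "$result"

# For the quickstart data, the call would be:
# remove_dirs /tmp/kafka-logs /tmp/zookeeper /tmp/kraft-combined-logs
```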
|
|
|
</div> |
|
|
|
<div class="quickstart-step"> |
|
<h4 class="anchor-heading"> |
|
<a class="anchor-link" id="quickstart_kafkacongrats" href="#quickstart_kafkacongrats"></a> |
|
<a href="#quickstart_kafkacongrats">Congratulations!</a> |
|
</h4> |
|
|
|
<p>You have successfully finished the Apache Kafka quickstart.</p>
|
|
|
<p>To learn more, we suggest the following next steps:</p> |
|
|
|
<ul> |
|
<li> |
|
Read through the brief <a href="/intro">Introduction</a> |
|
to learn how Kafka works at a high level, its main concepts, and how it compares to other |
|
technologies. To understand Kafka in more detail, head over to the |
|
<a href="/documentation/">Documentation</a>. |
|
</li> |
|
<li> |
|
Browse through the <a href="/powered-by">Use Cases</a> to learn how |
|
other users in our world-wide community are getting value out of Kafka. |
|
</li> |
|
<!-- |
|
<li> |
|
Learn how _Kafka compares to other technologies_ you might be familiar with. |
|
[note to design team: this new page is not yet written] |
|
</li> |
|
--> |
|
<li> |
|
Join a <a href="/events">local Kafka meetup group</a> and |
|
<a href="https://kafka-summit.org/past-events/">watch talks from Kafka Summit</a>, |
|
the main conference of the Kafka community. |
|
</li> |
|
</ul> |
|
</div> |
|
</script> |
|
|
|
<div class="p-quickstart"></div>
|
|
|