
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<p> Here is a description of a few of the popular use cases for Apache Kafka. For an overview of a number of these areas in action, see <a href="http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying">this blog post</a>. </p>

<h4><a id="uses_messaging" href="#uses_messaging">Messaging</a></h4>

Kafka works well as a replacement for a more traditional message broker. Message brokers are used for a variety of reasons (to decouple processing from data producers, to buffer unprocessed messages, etc.). In comparison to most messaging systems, Kafka has better throughput, built-in partitioning, replication, and fault tolerance, which makes it a good solution for large-scale message processing applications.
<p>
In our experience, messaging uses are often comparatively low-throughput, but may require low end-to-end latency and often depend on the strong durability guarantees Kafka provides.
<p>
In this domain, Kafka is comparable to traditional messaging systems such as <a href="http://activemq.apache.org">ActiveMQ</a> or <a href="https://www.rabbitmq.com">RabbitMQ</a>.

<h4><a id="uses_website" href="#uses_website">Website Activity Tracking</a></h4>

The original use case for Kafka was rebuilding a user activity tracking pipeline as a set of real-time publish-subscribe feeds. This means site activity (page views, searches, or other actions users may take) is published to central topics, with one topic per activity type. These feeds are available for subscription for a range of use cases, including real-time processing, real-time monitoring, and loading into Hadoop or offline data warehousing systems for offline processing and reporting.
<p>
Activity tracking is often very high volume, as many activity messages are generated for each user page view.

<h4><a id="uses_metrics" href="#uses_metrics">Metrics</a></h4>

Kafka is often used for operational monitoring data. This involves aggregating statistics from distributed applications to produce centralized feeds of operational data.

<h4><a id="uses_logs" href="#uses_logs">Log Aggregation</a></h4>

Many people use Kafka as a replacement for a log aggregation solution. Log aggregation typically collects physical log files off servers and puts them in a central place (a file server or HDFS perhaps) for processing. Kafka abstracts away the details of files and gives a cleaner abstraction of log or event data as a stream of messages. This allows for lower-latency processing and easier support for multiple data sources and distributed data consumption.

<p>
In comparison to log-centric systems like Scribe or Flume, Kafka offers equally good performance, stronger durability guarantees due to replication, and much lower end-to-end latency.

<h4><a id="uses_streamprocessing" href="#uses_streamprocessing">Stream Processing</a></h4>

Many users of Kafka process data in processing pipelines consisting of multiple stages, where raw input data is consumed from Kafka topics and then aggregated, enriched, or otherwise transformed into new topics for further consumption or follow-up processing. For example, a processing pipeline for recommending news articles might crawl article content from RSS feeds and publish it to an "articles" topic; further processing might normalize or deduplicate this content and publish the cleansed article content to a new topic; a final processing stage might attempt to recommend this content to users. Such processing pipelines create graphs of real-time data flows based on the individual topics. Starting in 0.10.0.0, a lightweight but powerful stream processing library called <a href="#streams_overview">Kafka Streams</a> is available in Apache Kafka to perform the kind of data processing described above. Apart from Kafka Streams, alternative open source stream processing tools include <a href="https://storm.apache.org/">Apache Storm</a> and <a href="http://samza.apache.org/">Apache Samza</a>.
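<p>
The middle stage of the news-article pipeline described above can be modeled in plain Java. This is only an illustrative sketch: in-memory lists stand in for the "articles" topic and the cleansed topic, it does not use the Kafka Streams API, and the class name <code>ArticlePipelineSketch</code>, the method names, and the normalization rules are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical in-memory model of one pipeline stage; lists stand in for topics.
public class ArticlePipelineSketch {

    // Normalize raw article text: trim, collapse runs of whitespace, lower-case.
    static String normalize(String raw) {
        return raw.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    // Read every record from the stand-in "articles" topic, normalize it, and
    // drop duplicates, keeping the first occurrence in arrival order.
    static List<String> cleanse(List<String> articlesTopic) {
        Set<String> seen = new LinkedHashSet<>();
        for (String raw : articlesTopic) {
            seen.add(normalize(raw));
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> articles = Arrays.asList("Kafka  Released", "kafka released ", "Storm Update");
        System.out.println(cleanse(articles)); // prints [kafka released, storm update]
    }
}
```

In a real deployment each stage would read from and write to Kafka topics, so that a later stage (such as the recommender) can consume the cleansed output independently of the stage that produced it.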

<h4><a id="uses_eventsourcing" href="#uses_eventsourcing">Event Sourcing</a></h4>

<a href="http://martinfowler.com/eaaDev/EventSourcing.html">Event sourcing</a> is a style of application design where state changes are logged as a time-ordered sequence of records. Kafka's support for very large stored log data makes it an excellent backend for an application built in this style.
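<p>
As a minimal sketch of the idea, the following plain-Java example rebuilds application state by replaying a time-ordered list of events. The list stands in for a log stored in Kafka; the <code>Event</code> type, the account-balance domain, and the <code>replay</code> method are hypothetical names chosen for illustration and are not part of any Kafka API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: state is never stored directly, only derived from events.
public class EventSourcingSketch {

    // One logged state change: a signed amount applied to an account.
    static final class Event {
        final String account;
        final long delta;

        Event(String account, long delta) {
            this.account = account;
            this.delta = delta;
        }
    }

    // Rebuild the current balances by applying every event in log order.
    static Map<String, Long> replay(List<Event> log) {
        Map<String, Long> balances = new HashMap<>();
        for (Event e : log) {
            balances.merge(e.account, e.delta, Long::sum);
        }
        return balances;
    }

    public static void main(String[] args) {
        List<Event> log = Arrays.asList(
                new Event("alice", 100),
                new Event("bob", 40),
                new Event("alice", -30));
        System.out.println(replay(log).get("alice")); // prints 70
    }
}
```

Because the state is a pure function of the log, replaying the same events always yields the same result, which is why a durable, ordered log is a natural fit for this style.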

<h4><a id="uses_commitlog" href="#uses_commitlog">Commit Log</a></h4>

Kafka can serve as a kind of external commit log for a distributed system. The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data. The <a href="/documentation.html#compaction">log compaction</a> feature in Kafka helps support this usage. In this usage, Kafka is similar to the <a href="http://zookeeper.apache.org/bookkeeper/">Apache BookKeeper</a> project.
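<p>
To illustrate how a restarting node might restore its data from a compacted log, here is a simplified plain-Java model. It assumes that only the most recent value per key matters and that a record with a null value marks a deletion. The <code>LogRecord</code> type and the <code>restore</code> method are hypothetical, and the sketch does not reproduce how Kafka's brokers actually perform compaction.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of restoring state from a keyed commit log.
public class CompactedLogSketch {

    // A keyed log entry; a null value is treated as a delete marker.
    static final class LogRecord {
        final String key;
        final String value;

        LogRecord(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Replay the log in order, keeping only the latest value for each key
    // and dropping keys whose latest record is a delete marker.
    static Map<String, String> restore(List<LogRecord> log) {
        Map<String, String> state = new LinkedHashMap<>();
        for (LogRecord r : log) {
            if (r.value == null) {
                state.remove(r.key);
            } else {
                state.put(r.key, r.value);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        List<LogRecord> log = Arrays.asList(
                new LogRecord("node-1", "v1"),
                new LogRecord("node-2", "v1"),
                new LogRecord("node-1", "v2"),
                new LogRecord("node-2", null));
        System.out.println(restore(log)); // prints {node-1=v2}
    }
}
```

The restored state depends only on the last record for each key, so older records for the same key can be discarded without changing what a recovering node ends up with.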