
KAFKA-3479: Add new consumer metrics documentation

added new consumer metrics section
refactored common metrics into new section
updated TOC

Author: Kaufman Ng <kaufman@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #1361 from coughman/KAFKA-3479-consumer-metrics-doc
Kaufman Ng 8 years ago committed by Ewen Cheslack-Postava
parent
commit
6b2564811a
  1. 1
      .gitignore
  2. 5
      docs/documentation.html
  3. 399
      docs/ops.html

1
.gitignore vendored

@@ -27,6 +27,7 @@ kafka.iws
.vagrant
Vagrantfile.local
/logs
.DS_Store
config/server-*
config/zookeeper-*

5
docs/documentation.html

@@ -110,6 +110,11 @@ Prior releases: <a href="/07/documentation.html">0.7.x</a>, <a href="/08/documen
<li><a href="#ext4">Ext4 Notes</a>
</ul>
<li><a href="#monitoring">6.6 Monitoring</a>
<ul>
<li><a href="#selector_monitoring">Common monitoring metrics for producer/consumer/connect</a></li>
<li><a href="#new_producer_monitoring">New producer monitoring</a></li>
<li><a href="#new_consumer_monitoring">New consumer monitoring</a></li>
</ul>
<li><a href="#zk">6.7 ZooKeeper</a>
<ul>
<li><a href="#zkversion">Stable Version</a>

399
docs/ops.html

@@ -689,6 +689,149 @@ We do graphing and alerting on the following metrics:
</tr>
</tbody></table>
<h4><a id="selector_monitoring" href="#selector_monitoring">Common monitoring metrics for producer/consumer/connect</a></h4>
The following metrics are available on producer/consumer/connector instances. For specific metrics, please see the following sections.
<table class="data-table">
<tbody>
<tr>
<th>Metric/Attribute name</th>
<th>Description</th>
<th>Mbean name</th>
</tr>
<tr>
<td>connection-close-rate</td>
<td>Connections closed per second in the window.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>connection-creation-rate</td>
<td>New connections established per second in the window.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>network-io-rate</td>
<td>The average number of network operations (reads or writes) on all connections per second.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>outgoing-byte-rate</td>
<td>The average number of outgoing bytes sent per second to all servers.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>request-rate</td>
<td>The average number of requests sent per second.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>request-size-avg</td>
<td>The average size of all requests in the window.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>request-size-max</td>
<td>The maximum size of any request sent in the window.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>incoming-byte-rate</td>
<td>Bytes/second read off all sockets.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>response-rate</td>
<td>Responses received per second.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>select-rate</td>
<td>Number of times the I/O layer checked for new I/O to perform per second.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>io-wait-time-ns-avg</td>
<td>The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>io-wait-ratio</td>
<td>The fraction of time the I/O thread spent waiting.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>io-time-ns-avg</td>
<td>The average length of time for I/O per select call in nanoseconds.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>io-ratio</td>
<td>The fraction of time the I/O thread spent doing I/O.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>connection-count</td>
<td>The current number of active connections.</td>
<td>kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)</td>
</tr>
</tbody>
</table>
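The MBean names in the table above can be queried with the standard JMX API. The following is a minimal sketch, assuming the client runs in the same JVM and registers its metrics with the platform MBean server; the class and method names (<code>ClientMetricsQuery</code>, <code>clientMetricsPattern</code>, <code>printAttribute</code>) are hypothetical and only illustrate the naming scheme. Client ids containing special characters may be quoted in the registered name, which this sketch does not handle.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class ClientMetricsQuery {

    // Builds a JMX pattern such as kafka.consumer:type=consumer-metrics,client-id=*
    // clientType is assumed to be "producer", "consumer" or "connect", as in the table.
    public static ObjectName clientMetricsPattern(String clientType)
            throws MalformedObjectNameException {
        return new ObjectName(
                "kafka." + clientType + ":type=" + clientType + "-metrics,client-id=*");
    }

    // Prints one attribute (for example "connection-count") for every matching MBean
    // and returns how many MBeans matched. Returns 0 if no client is registered.
    public static int printAttribute(MBeanServer server, ObjectName pattern, String attribute)
            throws Exception {
        Set<ObjectName> names = server.queryNames(pattern, null);
        for (ObjectName name : names) {
            Object value = server.getAttribute(name, attribute);
            System.out.println(
                    name.getKeyProperty("client-id") + " " + attribute + " = " + value);
        }
        return names.size();
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        printAttribute(server, clientMetricsPattern("consumer"), "connection-count");
    }
}
```

When the client runs in another process, the same pattern could be used over a remote JMX connection instead of the platform MBean server.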
<h4><a id="common_node_monitoring" href="#common_node_monitoring">Common Per-broker metrics for producer/consumer/connect</a></h4>
The following per-broker metrics are available on producer/consumer/connector instances. For specific metrics, please see the following sections.
<table class="data-table">
<tbody>
<tr>
<th>Metric/Attribute name</th>
<th>Description</th>
<th>Mbean name</th>
</tr>
<tr>
<td>outgoing-byte-rate</td>
<td>The average number of outgoing bytes sent per second for a node.</td>
<td>kafka.[producer|consumer|connect]:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
</tr>
<tr>
<td>request-rate</td>
<td>The average number of requests sent per second for a node.</td>
<td>kafka.[producer|consumer|connect]:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
</tr>
<tr>
<td>request-size-avg</td>
<td>The average size of all requests in the window for a node.</td>
<td>kafka.[producer|consumer|connect]:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
</tr>
<tr>
<td>request-size-max</td>
<td>The maximum size of any request sent in the window for a node.</td>
<td>kafka.[producer|consumer|connect]:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
</tr>
<tr>
<td>incoming-byte-rate</td>
<td>The average number of bytes received per second for a node.</td>
<td>kafka.[producer|consumer|connect]:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
</tr>
<tr>
<td>request-latency-avg</td>
<td>The average request latency in ms for a node.</td>
<td>kafka.[producer|consumer|connect]:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
</tr>
<tr>
<td>request-latency-max</td>
<td>The maximum request latency in ms for a node.</td>
<td>kafka.[producer|consumer|connect]:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
</tr>
<tr>
<td>response-rate</td>
<td>Responses received per second for a node.</td>
<td>kafka.[producer|consumer|connect]:type=[consumer|producer|connect]-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
</tr>
</tbody>
</table>
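Per-broker MBeans carry an extra <code>node-id</code> key, so one wildcard pattern can cover every broker a client talks to. The sketch below assumes that the JMX domain follows the client type (for example <code>kafka.consumer</code> for a consumer) and that the client id needs no quoting; <code>NodeMetricNames</code>, <code>nodePattern</code> and <code>nodeIdOf</code> are hypothetical names used only for illustration.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class NodeMetricNames {

    // Pattern matching the node-level MBean of every broker for one client, e.g.
    // kafka.consumer:type=consumer-node-metrics,client-id=my-client,node-id=*
    // Assumption: the domain is "kafka." followed by the client type.
    public static ObjectName nodePattern(String clientType, String clientId)
            throws MalformedObjectNameException {
        return new ObjectName("kafka." + clientType + ":type=" + clientType
                + "-node-metrics,client-id=" + clientId + ",node-id=*");
    }

    // Extracts the value of the node-id key from a concrete MBean name.
    public static String nodeIdOf(ObjectName name) {
        return name.getKeyProperty("node-id");
    }

    public static void main(String[] args) throws Exception {
        ObjectName pattern = nodePattern("consumer", "my-client");
        ObjectName sample = new ObjectName(
                "kafka.consumer:type=consumer-node-metrics,client-id=my-client,node-id=3");
        System.out.println(pattern.apply(sample) + " " + nodeIdOf(sample));
    }
}
```

The resulting pattern can be passed to <code>MBeanServer.queryNames</code> to enumerate brokers and then read attributes such as <code>request-latency-avg</code> for each one.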
<h4><a id="new_producer_monitoring" href="#new_producer_monitoring">New producer monitoring</a></h4>
The following metrics are available on new producer instances.
@@ -794,157 +937,231 @@ The following metrics are available on new producer instances.
<td>The age in seconds of the current producer metadata being used.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>connection-close-rate</td>
<td>Connections closed per second in the window.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>record-send-rate</td>
<td>The average number of records sent per second for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>connection-creation-rate</td>
<td>New connections established per second in the window.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>byte-rate</td>
<td>The average number of bytes sent per second for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>network-io-rate</td>
<td>The average number of network operations (reads or writes) on all connections per second.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>compression-rate</td>
<td>The average compression rate of record batches for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>outgoing-byte-rate</td>
<td>The average number of outgoing bytes sent per second to all servers.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>record-retry-rate</td>
<td>The average per-second number of retried record sends for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>request-rate</td>
<td>The average number of requests sent per second.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>record-error-rate</td>
<td>The average per-second number of record sends that resulted in errors for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>request-size-avg</td>
<td>The average size of all requests in the window.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>produce-throttle-time-max</td>
<td>The maximum time in ms a request was throttled by a broker.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>request-size-max</td>
<td>The maximum size of any request sent in the window.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>produce-throttle-time-avg</td>
<td>The average time in ms a request was throttled by a broker.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+)</td>
</tr>
</tbody></table>
<h4><a id="new_consumer_monitoring" href="#new_consumer_monitoring">New consumer monitoring</a></h4>
The following metrics are available on new consumer instances.
<h5><a id="new_consumer_group_monitoring" href="#new_consumer_group_monitoring">Consumer Group Metrics</a></h5>
<table class="data-table">
<tbody>
<tr>
<td>incoming-byte-rate</td>
<td>Bytes/second read off all sockets.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<th>Metric/Attribute name</th>
<th>Description</th>
<th>Mbean name</th>
</tr>
<tr>
<td>response-rate</td>
<td>Responses received per second.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>commit-latency-avg</td>
<td>The average time taken for a commit request</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>select-rate</td>
<td>Number of times the I/O layer checked for new I/O to perform per second.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>commit-latency-max</td>
<td>The max time taken for a commit request</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>io-wait-time-ns-avg</td>
<td>The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>commit-rate</td>
<td>The number of commit calls per second</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>io-wait-ratio</td>
<td>The fraction of time the I/O thread spent waiting.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>assigned-partitions</td>
<td>The number of partitions currently assigned to this consumer</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>io-time-ns-avg</td>
<td>The average length of time for I/O per select call in nanoseconds.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>heartbeat-response-time-max</td>
<td>The max time taken to receive a response to a heartbeat request</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>io-ratio</td>
<td>The fraction of time the I/O thread spent doing I/O.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>heartbeat-rate</td>
<td>The average number of heartbeats per second</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>connection-count</td>
<td>The current number of active connections.</td>
<td>kafka.producer:type=producer-metrics,client-id=([-.\w]+)</td>
<td>join-time-avg</td>
<td>The average time taken for a group rejoin</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>outgoing-byte-rate</td>
<td>The average number of outgoing bytes sent per second for a node.</td>
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
<td>join-time-max</td>
<td>The max time taken for a group rejoin</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>request-rate</td>
<td>The average number of requests sent per second for a node.</td>
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
<td>join-rate</td>
<td>The number of group joins per second</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>request-size-avg</td>
<td>The average size of all requests in the window for a node.</td>
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
<td>sync-time-avg</td>
<td>The average time taken for a group sync</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>request-size-max</td>
<td>The maximum size of any request sent in the window for a node.</td>
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
<td>sync-time-max</td>
<td>The max time taken for a group sync</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>incoming-byte-rate</td>
<td>The average number of bytes received per second for a node.</td>
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
<td>sync-rate</td>
<td>The number of group syncs per second</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>request-latency-avg</td>
<td>The average request latency in ms for a node.</td>
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
<td>last-heartbeat-seconds-ago</td>
<td>The number of seconds since the last coordinator heartbeat</td>
<td>kafka.consumer:type=consumer-coordinator-metrics,client-id=([-.\w]+)</td>
</tr>
</tbody>
</table>
<h5><a id="new_consumer_fetch_monitoring" href="#new_consumer_fetch_monitoring">Consumer Fetch Metrics</a></h5>
<table class="data-table">
<tbody>
<tr>
<td>request-latency-max</td>
<td>The maximum request latency in ms for a node.</td>
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
<th>Metric/Attribute name</th>
<th>Description</th>
<th>Mbean name</th>
</tr>
<tr>
<td>response-rate</td>
<td>Responses received per second for a node.</td>
<td>kafka.producer:type=producer-node-metrics,client-id=([-.\w]+),node-id=([0-9]+)</td>
<td>fetch-size-avg</td>
<td>The average number of bytes fetched per request</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>record-send-rate</td>
<td>The average number of records sent per second for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
<td>fetch-size-max</td>
<td>The maximum number of bytes fetched per request</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>byte-rate</td>
<td>The average number of bytes sent per second for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
<td>bytes-consumed-rate</td>
<td>The average number of bytes consumed per second</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>compression-rate</td>
<td>The average compression rate of record batches for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
<td>records-per-request-avg</td>
<td>The average number of records in each request</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>record-retry-rate</td>
<td>The average per-second number of retried record sends for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
<td>records-consumed-rate</td>
<td>The average number of records consumed per second</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>record-error-rate</td>
<td>The average per-second number of record sends that resulted in errors for a topic.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
<td>fetch-latency-avg</td>
<td>The average time taken for a fetch request</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>produce-throttle-time-max</td>
<td>The maximum time in ms a request was throttled by a broker.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+)</td>
<td>fetch-latency-max</td>
<td>The max time taken for a fetch request</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>produce-throttle-time-avg</td>
<td>The average time in ms a request was throttled by a broker.</td>
<td>kafka.producer:type=producer-topic-metrics,client-id=([-.\w]+)</td>
<td>fetch-rate</td>
<td>The number of fetch requests per second</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
</tbody></table>
<tr>
<td>records-lag-max</td>
<td>The maximum lag in terms of number of records for any partition in this window</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>fetch-throttle-time-avg</td>
<td>The average throttle time in ms</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>fetch-throttle-time-max</td>
<td>The maximum throttle time in ms</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+)</td>
</tr>
</tbody>
</table>
<h5><a id="topic_fetch_monitoring" href="#topic_fetch_monitoring">Topic-level Fetch Metrics</a></h5>
<table class="data-table">
<tbody>
<tr>
<th>Metric/Attribute name</th>
<th>Description</th>
<th>Mbean name</th>
</tr>
<tr>
<td>fetch-size-avg</td>
<td>The average number of bytes fetched per request for a specific topic.</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>fetch-size-max</td>
<td>The maximum number of bytes fetched per request for a specific topic.</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>bytes-consumed-rate</td>
<td>The average number of bytes consumed per second for a specific topic.</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>records-per-request-avg</td>
<td>The average number of records in each request for a specific topic.</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
<tr>
<td>records-consumed-rate</td>
<td>The average number of records consumed per second for a specific topic.</td>
<td>kafka.consumer:type=consumer-fetch-manager-metrics,client-id=([-.\w]+),topic=([-.\w]+)</td>
</tr>
</tbody>
</table>
<h5><a id="others_monitoring" href="#others_monitoring">Others</a></h5>
We recommend monitoring GC time and other stats, as well as various server stats such as CPU utilization, I/O service time, etc.
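GC time can be read from inside the JVM through the standard <code>java.lang.management</code> API. The following is a minimal sketch of that idea rather than a recommended monitoring setup; the class name <code>GcStats</code> and the method <code>totalGcTimeMillis</code> are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {

    // Sums the cumulative collection time in milliseconds across all collectors.
    // getCollectionTime() returns -1 when a collector does not report it,
    // so non-positive values are skipped.
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("total GC time ms = " + totalGcTimeMillis());
    }
}
```

Sampling this value periodically and graphing the difference between samples gives the GC time spent per interval.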
