
KAFKA-4079; Documentation for secure quotas

Details in KIP-55.

Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Jun Rao <junrao@gmail.com>

Closes #1847 from rajinisivaram/KAFKA-4079
pull/1847/merge
Rajini Sivaram 8 years ago committed by Jun Rao
parent
commit
f27a6f319a
  1. 39
      docs/design.html
  2. 87
      docs/ops.html

39
docs/design.html

@@ -354,15 +354,44 @@ Further cleaner configurations are described <a href="/documentation.html#broker
<h3><a id="design_quotas" href="#design_quotas">4.9 Quotas</a></h3>
<p>
Starting in 0.9, the Kafka cluster has the ability to enforce quotas on produce and fetch requests. Quotas are basically byte-rate thresholds defined per client-id. A client-id logically identifies an application making a request. Hence a single client-id can span multiple producer and consumer instances and the quota will apply for all of them as a single entity i.e. if client-id="test-client" has a produce quota of 10MB/sec, this is shared across all instances with that same id.
Starting in 0.9, the Kafka cluster has the ability to enforce quotas on produce and fetch requests. Quotas are basically byte-rate thresholds defined per group of clients sharing a quota.
</p>
<h4><a id="design_quotasnecessary" href="#design_quotasnecessary">Why are quotas necessary?</a></h4>
<p>
It is possible for producers and consumers to produce/consume very high volumes of data and thus monopolize broker resources, cause network saturation and generally DoS other clients and the brokers themselves. Having quotas protects against these issues and is all the more important in large multi-tenant clusters where a small set of badly behaved clients can degrade the user experience for the well-behaved ones. In fact, when running Kafka as a service, this even makes it possible to enforce API limits according to an agreed-upon contract.
</p>
<h4><a id="design_quotasgroups" href="#design_quotasgroups">Client groups</a></h4>
The identity of Kafka clients is the user principal, which represents an authenticated user in a secure cluster. In a cluster that supports unauthenticated clients, the user principal is a grouping of unauthenticated users
chosen by the broker using a configurable <code>PrincipalBuilder</code>. Client-id is a logical grouping of clients with a meaningful name chosen by the client application. The tuple (user, client-id) defines a secure logical group of clients that share both user principal and client-id.
<p>
Quotas can be applied to (user, client-id), user or client-id groups. For a given connection, the most specific quota matching the connection is applied. All connections of a quota group share the quota configured for the group.
For example, if (user="test-user", client-id="test-client") has a produce quota of 10MB/sec, this is shared across all producer instances of user "test-user" with the client-id "test-client".
</p>
<h4><a id="design_quotasconfig" href="#design_quotasconfig">Quota Configuration</a></h4>
<p>
Quota configuration may be defined for (user, client-id), user and client-id groups. It is possible to override the default quota at any of the quota levels that needs a higher (or even lower) quota. The mechanism is similar to the per-topic log config overrides.
User and (user, client-id) quota overrides are written to ZooKeeper under <i><b>/config/users</b></i> and client-id quota overrides are written under <i><b>/config/clients</b></i>. These overrides are read by all brokers and are effective immediately. This lets us change quotas without having to do a rolling restart of the entire cluster. See <a href="#quotas">here</a> for details.
Default quotas for each group may also be updated dynamically using the same mechanism.
</p>
<p>
The order of precedence for quota configuration is:
<ol>
<li>/config/users/&lt;user&gt;/clients/&lt;client-id&gt;</li>
<li>/config/users/&lt;user&gt;/clients/&lt;default&gt;</li>
<li>/config/users/&lt;user&gt;</li>
<li>/config/users/&lt;default&gt;/clients/&lt;client-id&gt;</li>
<li>/config/users/&lt;default&gt;/clients/&lt;default&gt;</li>
<li>/config/users/&lt;default&gt;</li>
<li>/config/clients/&lt;client-id&gt;</li>
<li>/config/clients/&lt;default&gt;</li>
</ol>
Broker properties (quota.producer.default, quota.consumer.default) can also be used to set defaults for client-id groups. These properties are being deprecated and will be removed in a later release. Default quotas for client-id can be set in ZooKeeper in the same way as the other quota overrides and defaults.
</p>
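The precedence list above can be read as a first-match lookup. The following is a minimal Python sketch of that reading, not the broker's actual implementation: the dictionary stands in for the ZooKeeper config paths, and the names <code>candidate_paths</code> and <code>resolve_quota</code> are hypothetical. It also assumes a matching path is taken as a whole, rather than resolving produce and fetch quotas separately.

```python
from typing import Dict, List, Optional, Tuple

# Placeholder used in the documented paths for the default entity.
DEFAULT = "<default>"


def candidate_paths(user: str, client_id: str) -> List[str]:
    """Return the config paths for a connection, most specific first.

    The order mirrors the documented order of precedence.
    """
    return [
        f"/config/users/{user}/clients/{client_id}",
        f"/config/users/{user}/clients/{DEFAULT}",
        f"/config/users/{user}",
        f"/config/users/{DEFAULT}/clients/{client_id}",
        f"/config/users/{DEFAULT}/clients/{DEFAULT}",
        f"/config/users/{DEFAULT}",
        f"/config/clients/{client_id}",
        f"/config/clients/{DEFAULT}",
    ]


def resolve_quota(
    configs: Dict[str, Dict[str, int]], user: str, client_id: str
) -> Optional[Tuple[str, Dict[str, int]]]:
    """Return (path, quota) for the first configured path, or None.

    `configs` is a hypothetical in-memory stand-in for ZooKeeper.
    None means no override or default is configured at any level.
    """
    for path in candidate_paths(user, client_id):
        if path in configs:
            return path, configs[path]
    return None


if __name__ == "__main__":
    configs = {
        "/config/users/user1": {"producer_byte_rate": 1024},
        "/config/clients/clientA": {"producer_byte_rate": 2048},
    }
    # user1 has a user-level quota, which outranks the client-id quota.
    print(resolve_quota(configs, "user1", "clientA"))
    # user2 has no user-level entry, so the client-id quota applies.
    print(resolve_quota(configs, "user2", "clientA"))
```

Under these assumptions, a user-level entry always wins over a client-id entry for the same connection, which matches the position of <i>/config/users/&lt;user&gt;</i> above <i>/config/clients/&lt;client-id&gt;</i> in the list.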
<h4><a id="design_quotasenforcement" href="#design_quotasenforcement">Enforcement</a></h4>
<p>
By default, each unique client-id receives a fixed quota in bytes/sec as configured by the cluster (quota.producer.default, quota.consumer.default).
By default, each unique client group receives a fixed quota in bytes/sec as configured by the cluster.
This quota is defined on a per-broker basis. Each client can publish/fetch a maximum of X bytes/sec per broker before it gets throttled. We decided that defining these quotas per broker is much better than having a fixed cluster-wide bandwidth per client because that would require a mechanism to share client quota usage among all the brokers. This can be harder to get right than the quota implementation itself!
</p>
<p>
@@ -371,9 +400,3 @@ It is possible for producers and consumers to produce/consume very high volumes
<p>
Client byte rate is measured over multiple small windows (e.g. 30 windows of 1 second each) in order to detect and correct quota violations quickly. Typically, having large measurement windows (e.g. 10 windows of 30 seconds each) leads to large bursts of traffic followed by long delays, which is not great in terms of user experience.
</p>
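To make the windowing idea concrete, here is a simplified Python sketch of a byte rate measured over several small windows. It is an illustration under stated assumptions, not Kafka's metrics code: the class name <code>WindowedRate</code> is hypothetical, timestamps are passed in explicitly, and the rate is always divided by the full measurement span, which underestimates the rate until the span has filled.

```python
from collections import deque


class WindowedRate:
    """Byte rate over the last `num_windows` windows of `window_secs` each.

    Hypothetical helper for illustration; timestamps are in seconds.
    """

    def __init__(self, num_windows: int = 30, window_secs: float = 1.0):
        self.num_windows = num_windows
        self.window_secs = window_secs
        # Each entry is [window_index, byte_count], oldest first.
        self.windows = deque()

    def _expire(self, current_idx: int) -> None:
        # Drop windows that fall outside the measurement span.
        oldest_allowed = current_idx - self.num_windows + 1
        while self.windows and self.windows[0][0] < oldest_allowed:
            self.windows.popleft()

    def record(self, now: float, nbytes: int) -> None:
        """Add `nbytes` to the window that contains time `now`."""
        idx = int(now // self.window_secs)
        if self.windows and self.windows[-1][0] == idx:
            self.windows[-1][1] += nbytes
        else:
            self.windows.append([idx, nbytes])
        self._expire(idx)

    def rate(self, now: float) -> float:
        """Return bytes/sec averaged over the whole measurement span."""
        self._expire(int(now // self.window_secs))
        total = sum(count for _, count in self.windows)
        return total / (self.num_windows * self.window_secs)


if __name__ == "__main__":
    r = WindowedRate(num_windows=10, window_secs=1.0)
    r.record(0.5, 1000)
    r.record(1.2, 1000)
    print(r.rate(1.5))   # both windows still inside the 10 s span
    print(r.rate(10.5))  # the first window has expired
```

With many short windows, an old burst drops out of the measurement one small window at a time, so the measured rate recovers gradually instead of in one large step.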
<h4><a id="design_quotasoverrides" href="#design_quotasoverrides">Quota overrides</a></h4>
<p>
It is possible to override the default quota for client-ids that need a higher (or even lower) quota. The mechanism is similar to the per-topic log config overrides.
Client-id overrides are written to ZooKeeper under <i><b>/config/clients</b></i>. These overrides are read by all brokers and are effective immediately. This lets us change quotas without having to do a rolling restart of the entire cluster. See <a href="#quotas">here</a> for details.
</p>

87
docs/ops.html

@@ -340,23 +340,83 @@ Topic:foo PartitionCount:1 ReplicationFactor:3 Configs:
</pre>
<h4><a id="quotas" href="#quotas">Setting quotas</a></h4>
It is possible to set default quotas that apply to all client-ids by setting these configs on the brokers. By default, each client-id receives an unlimited quota. The following sets the default quota per producer and consumer client-id to 10MB/sec.
Quota overrides and defaults may be configured at (user, client-id), user or client-id levels as described <a href="#design_quotas">here</a>.
By default, clients receive an unlimited quota.
It is possible to set custom quotas for each (user, client-id), user or client-id group.
<p>
Configure custom quota for (user=user1, client-id=clientA):
<pre>
quota.producer.default=10485760
quota.consumer.default=10485760
> bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=1024,consumer_byte_rate=2048' --entity-type users --entity-name user1 --entity-type clients --entity-name clientA
Updated config for entity: user-principal 'user1', client-id 'clientA'.
</pre>
Configure custom quota for user=user1:
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=1024,consumer_byte_rate=2048' --entity-type users --entity-name user1
Updated config for entity: user-principal 'user1'.
</pre>
Configure custom quota for client-id=clientA:
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=1024,consumer_byte_rate=2048' --entity-type clients --entity-name clientA
Updated config for entity: client-id 'clientA'.
</pre>
It is possible to set default quotas for each (user, client-id), user or client-id group by specifying the <i>--entity-default</i> option instead of <i>--entity-name</i>.
<p>
Configure default client-id quota for user=user1:
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=1024,consumer_byte_rate=2048' --entity-type users --entity-name user1 --entity-type clients --entity-default
Updated config for entity: user-principal 'user1', default client-id.
</pre>
Configure default quota for user:
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=1024,consumer_byte_rate=2048' --entity-type users --entity-default
Updated config for entity: default user-principal.
</pre>
It is also possible to set custom quotas for each client.
Configure default quota for client-id:
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=1024,consumer_byte_rate=2048' --entity-name clientA --entity-type clients
Updated config for clientId: "clientA".
> bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=1024,consumer_byte_rate=2048' --entity-type clients --entity-default
Updated config for entity: default client-id.
</pre>
Here's how to describe the quota for a given client.
Here's how to describe the quota for a given (user, client-id):
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-type users --entity-name user1 --entity-type clients --entity-name clientA
Configs for user-principal 'user1', client-id 'clientA' are producer_byte_rate=1024,consumer_byte_rate=2048
</pre>
Describe quota for a given user:
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-type users --entity-name user1
Configs for user-principal 'user1' are producer_byte_rate=1024,consumer_byte_rate=2048
</pre>
Describe quota for a given client-id:
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-type clients --entity-name clientA
Configs for client-id 'clientA' are producer_byte_rate=1024,consumer_byte_rate=2048
</pre>
If entity name is not specified, all entities of the specified type are described. For example, describe all users:
<pre>
> ./kafka-configs.sh --zookeeper localhost:2181 --describe --entity-name clientA --entity-type clients
Configs for clients:clientA are producer_byte_rate=1024,consumer_byte_rate=2048
> bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-type users
Configs for user-principal 'user1' are producer_byte_rate=1024,consumer_byte_rate=2048
Configs for default user-principal are producer_byte_rate=1024,consumer_byte_rate=2048
</pre>
Similarly for (user, client-id):
<pre>
> bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-type users --entity-type clients
Configs for user-principal 'user1', default client-id are producer_byte_rate=1024,consumer_byte_rate=2048
Configs for user-principal 'user1', client-id 'clientA' are producer_byte_rate=1024,consumer_byte_rate=2048
</pre>
<p>
It is possible to set default quotas that apply to all client-ids by setting these configs on the brokers. These properties are applied only if quota overrides or defaults are not configured in ZooKeeper. By default, each client-id receives an unlimited quota. The following sets the default quota per producer and consumer client-id to 10MB/sec.
<pre>
quota.producer.default=10485760
quota.consumer.default=10485760
</pre>
Note that these properties are being deprecated and may be removed in a future release. Defaults configured using kafka-configs.sh take precedence over these properties.
<h3><a id="datacenters" href="#datacenters">6.2 Datacenters</a></h3>
@@ -685,10 +745,11 @@ We do graphing and alerting on the following metrics:
<td>between 0 and 1, ideally &gt; 0.3</td>
</tr>
<tr>
<td>Quota metrics per client-id</td>
<td>kafka.server:type={Produce|Fetch},client-id==([-.\w]+)</td>
<td>Two attributes. throttle-time indicates the amount of time in ms the client-id was throttled. Ideally = 0.
byte-rate indicates the data produce/consume rate of the client in bytes/sec.</td>
<td>Quota metrics per (user, client-id), user or client-id</td>
<td>kafka.server:type={Produce|Fetch},user=([-.\w]+),client-id=([-.\w]+)</td>
<td>Two attributes. throttle-time indicates the amount of time in ms the client was throttled. Ideally = 0.
byte-rate indicates the data produce/consume rate of the client in bytes/sec.
For (user, client-id) quotas, both user and client-id are specified. If per-client-id quota is applied to the client, user is not specified. If per-user quota is applied, client-id is not specified.</td>
</tr>
</tbody></table>
