Kafka brokers have a config called "offsets.topic.replication.factor" that specify the replication factor for the "__consumer_offsets" topic. The problem is that this config isn't being enforced. If an attempt to create the internal topic is made when there are fewer brokers than "offsets.topic.replication.factor", the topic ends up getting created anyway with the current number of live brokers. The current behavior is pretty surprising when you have clients or tooling running as the cluster is getting setup. Even if your cluster ends up being huge, you'll find out much later that __consumer_offsets was setup with no replication.
The cluster not meeting the "offsets.topic.replication.factor" requirement on the internal topic is another way of saying the cluster isn't fully setup yet.
The right behavior should be for "offsets.topic.replication.factor" to be enforced. Topic creation of the internal topic should fail with GROUP_COORDINATOR_NOT_AVAILABLE until the "offsets.topic.replication.factor" requirement is met. This closely resembles the behavior of regular topic creation when the requested replication factor exceeds the current size of the cluster, as the request fails with error INVALID_REPLICATION_FACTOR.
Author: Onur Karaman <okaraman@linkedin.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes#2177 from onurkaraman/KAFKA-3959