In this architecture, the entire application is run within the active region,
Applications
cannot be brought up automatically. Do not use a stretched cluster when fewer than three fully-operational regions are available. RPO is greater than 0 because
disappear. in each fully-operational region. Run this command to monitor the status for the reassignment: Run this command to finish the reassignment: For more information and examples, see Auto Data Balancing.
For instance, if there have been enough broker failures for a given partition to be below its
ZooKeeper deployment might look like this: Note that Kafka brokers do not necessarily need to be deployed in each data For example, some customers have one primary (the active) spread across multi-AZs.
Spoke clusters all replicate clicks and impressions Try it free today. To minimize possible data loss caused by unclean leader election, monitor observer replication to make sure that it is not falling too far behind.
Confluent Cloud and The tutorial injects latency and packet loss to simulate the distances between the regions and showcases the value of the new capabilities in a multi-region environment. configured for each broker.
Confluent Server is often run across availability zones or nearby datacenters.
In all three regions, deploy ZooKeeper. Three of the replicas will be sync replicas
that all the data will be replicated to the surviving second datacenter if the first one fails. Awareness for AZs tutorial, 2 broker pods in AZ 1, 1 broker pod in AZ 2, 3 broker pods in AZ 3, 3 broker pods in AZ 1, 1 broker pod in AZ 2, 2 broker pods in AZ 3. The confluent-rebalancer command line tool supports reassignment that also accounts for replica placement constraints.
Stretched cluster architectures are most commonly used for applications located document.write(new Date().getFullYear()); achieve an RTO=0. To learn more, see Auto Data Balancing. failure at a time. Do not use a stretched cluster when regions are greater than 50ms of network latency apart, for example inter-continent. it will result in data loss. availability zones (multi-AZ).
Consider using this architecture when only two fully operational regions are In order to have zero RTO, seamless client failover is required. local Confluent cluster. Copyright Confluent, Inc. 2014-
Often this architecture is chosen by on-premises customers who have limited access become available or scheduling pods to a different AZ. The region, producing and consuming from the passive region.
They will be in the pending state. An HA application has a disaster recovery policy. specified client.rack ID. cross-region replication is asynchronous. East may make up a single stretched cluster. Historically there are two types of replicas: leaders and followers.
replicated to a single, probably larger, aggregate Confluent cluster. This behaviour is controlled by the observerPromotionPolicy field in a topics replica placement policy.
application in a piecemeal fashion to Confluent Cloud.
With Multi-Region Clusters, clients are
With Rack Awareness, cluster.
For example, on a Kubernetes cluster with the following worker node spread: If you schedule 6 Kafka brokers, the pods would scheduled as one of the This will decrease round-robined fashion. A multi-region cluster deployment across three Kubernetes regions, where each With observers, you can define topics that synchronously replicate data within one region, but replicate the data asynchronously between property. Automatic observer promotion is only available with Replica Placement version 2 introduced in Confluent Platform 6.1. must be deployed in both fully-operational regions in order to achieve an RTO close to 0.
If you are deploying multiple multi-region clusters utilizing the same
multiple Kubernetes clusters running in different regions as a single identifies location of the broker. document.write(new Date().getFullYear());
the overall rebalance scope and therefore time. shown below: Once deployed, you can check to see which AZ the persistent volume is assigned
See Manage Confluent Role-based Access Control. each running a separate Confluent cluster, where each cluster is a copy of the
In a multi-AZ Kubernetes, the Kubernetes worker nodes are spread across AZs. For example, the following displays information about all replicas that constitute the testing-observers topic for the first partition: To try a detailed Multi-Region Clusters example, run the end-to-end Tutorial: Multi-Region Clusters.
all allow an application to recover from a disaster, and in the context of this document, a full region failure.
Cluster in each store (the spokes), supporting local applications.
Copyright Confluent, Inc. 2014- For example, if you have a cluster spanning us-west-1 and us-west-2, and you lose all brokers in us-west-1: If a topic has replicas in us-west-2 in the ISR, those brokers would automatically be elected leader and the clients multi-region cluster of each Confluent component.
run against a cluster in a specific context that specifies a Kubernetes
The spoke regions run regional ad-serving and ad-tracking applications with a RPO is greater than 0 because cross-region Confluent Server includes a command Copyright Confluent, Inc. 2014-
region failure. followers have been restored (they are caught up and have rejoined the ISR) then the observers are automatically demoted
To configure Rack Awareness, see configuring Rack
clients or later will then read from followers that have matching broker.rack as the RTO can be very low because you have at least one region
CFK supports various multi-region cluster deployment scenarios for Confluent Platform.
For example, if the pod that is on the failed AZ is kafka-0, and the Kafka Each node is labeled with the region that it is in. RPO is greater than 0 when inter-region replication is not guaranteed to be synchronous.
Confluent cluster and application.
regions simultaneously. Otherwise, RTO is greater than 0. with the required resource capacity in the same AZ, and will attach to the same specifically, GKE, EKS, and AKS, see Multi-region Cluster Networking
except that the original application might be using an alternative messaging system. Many consumer banks have low or zero RPO and RTO requirements and customers only
largest offset that was acknowledged. Confluent Cloud and Confluent Platform are often used across multiple regions (data centers), for servers and the hub cluster is performing game analytics. Umbrella term that encompasses architecture, implementation, tooling, policies, and procedures that
servers. Use the below decision tree to understand what architecture is best for you.
on-prem, sometimes three fully-operational regions are not available. A stretched 3-region cluster architecture involves three regions that are all components on Kubernetes.
You are welcome to contact Confluent
A single message produced or consumed to/from Confluent Cloud or Confluent Platform. In the context of For more information see Replica Placement. Instead, this Note that these clusters could run in the same region or separate regions. The ZooKeeper ensemble should be deployed so that if a network
across regions. examples GitHub repo, Multi Data Center Setup
Try it free today. to Confluent Cloud, use a replication architecture to migrate events and components of the Retailers with brick-and-mortar stores and e-commerce often operate a Confluent
Please contact Confluent Apache, Apache Kafka, Kafka, and associated open source project names are trademarks of the Apache Software Foundation, kafka.cluster:type=Partition,name=ReplicasCount,topic=
However, if the entire region
This sections refers to the failure of all replicas for a partition. document.write(new Date().getFullYear()); This dramatically reduces the amount of cross-datacenter traffic between clients and brokers. So, you should carefully consider Note that MRC is only available in Confluent Platform. Some can be done in one region and get propagated to all regions, and
region will have one ZooKeeper host. To mitigate this, three distinct pieces of functionality were added to Confluent Server: Before the introduction of this feature, all consume and produce operations took place on the leader. A 2-region active-passive architecture involves two fully-operational regions,
startups dont have SLAs that require low RPO and/or low RTO during a full
use a source Connector or Bridge designed to read from the alternative messaging system and
or networks, whereas other events can or must be available in separate regions or networks.
Kubernetes clusters. For example: The storage used by worker nodes are available across all AZs. that events may be restricted in certain regions for reasons other than network security. The pvc is named as data0-
to multiple regions. The ZooKeeper node id (mrc in the above example) can be any string, but it the producer. It can have values: The default observerPromotionPolicy is under-min-isr which is a change from the default version 1 replica placement behaviour.
Further, only consider this architecture when the application can be run in two
For example, in the United from the ISR. A stretched 2.5-region architecture involves two fully-operational regions and The most common environment where stretched clusters are deployed is in public Using distinct Kafka broker ids and component server ids (ZooKeeper and Schema Registry) across
replicated to and from a hub cluster, for example for analytics and inventory updates. You can create and manage rolebindings in any region in a multi-region
Alternatively, with observers in the second datacenter only, there are no guarantees on any worker in a zone that has capacity. The banks: low or zero RPO and RTO, customers within a single country or continent, A and access to only two fully-operational regions and one light region.
PersistentVolumes are selected or provisioned
they are run against the correct cluster. Ports, or static host-based ingress controller routing. Refer to the use cases for this architecture to learn scenarios when this
failover need to resume from?
cluster hosts CFK, Kafka brokers, and ZooKeeper servers: A multi-region cluster deployment across three Kubernetes regions, where each With follower-fetching, clients can also consume from observers.
cluster using annotations in the ZooKeeper CR. To handle partial failures automatically please see Automatic Observer Promotion. cloud regions with three availability zones (AZ). For example, the following replica placement JSON is invalid.
To use version 2 specify "version": 2 in your replica placement JSON. for guidance when considering these architectures. use cases and recommended high-level implementations.
When a fully operational region fails,
This can result in the truncation of all the topic partition logs to an When one of the fully-operational regions fails, if applications are running in
You can use kube-dns or CoreDNS to expose region clusters DNS. during normal operating conditions. This architecture provides RPO =0 or >0 depending on configuration, and RTO=>0.
When the active region fails, applications failover to the passive region. Each partition will be assigned five replicas.
Ensure that the storage class you use has In the Kafka CRs, specify a comma-separated list of ZooKeeper endpoints in running at all times. of Schema Registry. For example, run the commands below to start a reassignment that matches the topics replica placement constraints. Sometimes startups or early Confluent adopters choose this architecture as a short-term
In an unclean leader election, it is possible for the new leader to not have all of the produced records up to the fully-operational regions. If you run Auto Data Balancing or Self-Balancing Clusters, the preferred leaders will be redistributed among all racks that have replicas. Deploy an equal number of Confluent Server hosts maintain. This topic describes the behavior and practices for Confluent for Kubernetes in multi-AZ for guidance when considering these architectures. zones or nearby datacenters is dissimilar, in term of reliability, latency, bandwidth, or cost, this can result in higher latency, and deploy the fifth in the light region. replicate partitions from other brokers before it can start accepting read/write and those topics are accessible from all the regions in the cluster.
those pods will be brought up. In a five-host ZooKeeper deployment, two regions will have two ZooKeeper hosts, and the third Electing a replica or observer as
In this case, the Confluent cluster Commonly the third, light region is a
which now includes --topics and --exclude-internal-topics flags to limit the set of topics that are eligible for reassignment.
Do not use this architecture when regions are greater than 50ms apart, or when automatically redirected to another region.
example with geographic DNS routing. configured for Rack Awareness, where the racks are AZs. onsite to support local applications (the spokes), and a hub cluster for analytics. A 2-region active-active architecture involves two fully-operational regions,
Confluent Cloud is a fully-managed Apache Kafka service available on all three major clouds.
A highly available system can operate continuously even amidst failure.