For as long as there have been networks, there's been a need to match network capacity to the load created by applications and their users. The cloud, especially hybrid and multi-cloud, complicates network capacity planning.
Application performance depends on the network resources available to the sum of workloads, not just in aggregate but at every point where capacity has a specific limit.
Cloud is not just another hosting location. Workflows change as application components move and scale within the totality of cloud and data center resources. There are perhaps dozens of possible component configurations, each with its own workflows and its own effects on consumable resources. Cloud providers connect to corporate networks through gateways, and these gateways vary in capacity and quality of service. Some workflows go through a virtual private network (VPN) from cloud to cloud, creating congestion from internal application traffic that's hard to see and control.
These three factors complicate network capacity planning, which can be complex enough without them. Corporate networks are typically VPNs, which are IP networks built on service provider tunnels using multiprotocol label switching (MPLS) and Ethernet service access connections. Each VPN site has its own capacity, and the VPN as a whole has capacity limits. In addition, IP networks find applications and components through an IP address assigned from an address space. As workloads move and replicate to scale, an address can change, leading to inefficient traffic routing or even a lost connection.
The right configuration
The smartest configuration for hybrid and multi-cloud is one where corporate data centers are interconnected with local fiber or a data center interconnect (DCI) service from a carrier. The VPN connects users in all corporate facilities to the data center network, and the internet connects there as well. Cloud providers create connections either via the internet or through a dedicated gateway, but in nearly all cases, the connection goes to the data center network rather than directly to the VPN. Use this configuration as a baseline for network capacity planning.
The address space plan
When an application or a component of a distributed application moves or scales up, it needs a new IP address and capacity to route traffic to that new address. Every decision around workload portability and elasticity generates traffic on the data center network and the cloud gateway(s) involved. A workload's address determines how the workflows that pass through it connect, which in turn defines the traffic pathways and shows where to focus network capacity plans.
To plan realistic capacity requirements, formally trained network engineers dive into the complex math of the Erlang B formula. If you are inclined to learn it, check out James Martin's older book, Systems Analysis for Data Transmission. However, there are also easier rules of thumb.
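For readers who want to see what the Erlang B math looks like in practice, here is a minimal sketch in Python. It uses the standard recurrence for the Erlang B blocking probability; the function names and the example figures are illustrative assumptions, not something taken from the book mentioned above. Erlang B models a loss system with a fixed number of channels, so treat its output as a rough guide for packet networks rather than a precise prediction.

```python
def erlang_b(offered_load: float, channels: int) -> float:
    """Blocking probability for `offered_load` erlangs offered to `channels` channels.

    Uses the recurrence B(E, 0) = 1, B(E, m) = E*B(E, m-1) / (m + E*B(E, m-1)),
    which avoids the large factorials in the closed-form expression.
    """
    if offered_load < 0 or channels < 0:
        raise ValueError("offered_load and channels must be non-negative")
    blocking = 1.0  # with zero channels, every request is blocked
    for m in range(1, channels + 1):
        blocking = (offered_load * blocking) / (m + offered_load * blocking)
    return blocking


def channels_needed(offered_load: float, target_blocking: float) -> int:
    """Smallest channel count whose blocking probability is at or below the target."""
    if not 0 < target_blocking < 1:
        raise ValueError("target_blocking must be between 0 and 1")
    channels = 0
    while erlang_b(offered_load, channels) > target_blocking:
        channels += 1
    return channels


if __name__ == "__main__":
    # Hypothetical example: 20 erlangs of offered load, 1% blocking target.
    print(channels_needed(20.0, 0.01))
```

The second function shows the typical planning use: fix an acceptable blocking probability, then search for the smallest capacity that meets it.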
As a connection congests, the risk of delay and packet loss increases in a nonlinear fashion. This nonlinearity is a fundamental of network capacity planning. Problems ramp up slowly until the network reaches about 50% utilization; issues rise rapidly after that threshold. At 70% utilization, delay doubles, for example. Keep connection or gateway utilization at around the 50% level to avoid congestion during peaks. Unexpected traffic peaks often occur when a single transaction launches a complex multicomponent workflow and especially when traffic changes because of failover or scaling.
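The nonlinear relationship between utilization and delay can be illustrated with a simple queueing model. The sketch below assumes an M/M/1 queue, which the article does not name; it is used here only because its delay formula, 1 / (1 - utilization), is easy to compute. The exact multipliers depend on the traffic model and need not match the figures quoted above, but the shape of the curve is the point: delay grows slowly at low utilization and steeply as the link fills.

```python
def delay_factor(utilization: float) -> float:
    """Average time in an M/M/1 system, as a multiple of the no-load service time.

    Assumption: M/M/1 queueing (random arrivals, one server). Real links
    carry burstier traffic, so treat the numbers as illustrative only.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return 1.0 / (1.0 - utilization)


def within_target(load_mbps: float, capacity_mbps: float, target: float = 0.5) -> bool:
    """True when a link's utilization is at or below the target (50% by default)."""
    return load_mbps / capacity_mbps <= target


if __name__ == "__main__":
    for pct in (10, 30, 50, 70, 90):
        print(f"{pct}% utilization -> {delay_factor(pct / 100):.2f}x base delay")
```

The `within_target` helper is a hypothetical convenience for checking a link against the 50% rule of thumb.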
Size the DCI
The most significant network capacity planning decision is how to size the DCI network. It is the hub of all workflows, into and out of the cloud and to and from workers and internet users. The DCI network must never become congested.
The same workflow can travel across the data center network several times as it moves among components, which adds to the risk of congestion problems. Ensure that the total capacity of the data center and DCI network is at least three times the expected workload of all applications.
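The three-times rule reduces to simple arithmetic. The sketch below is one way to write it down; the `AppLoad` type, the Mbps unit and the sample figures are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable

# Rule of thumb from the text: capacity of at least 3x the expected workload.
DCI_HEADROOM = 3.0


@dataclass
class AppLoad:
    name: str
    peak_mbps: float  # expected peak workload traffic in Mbps (hypothetical unit)


def required_dci_capacity(apps: Iterable[AppLoad], headroom: float = DCI_HEADROOM) -> float:
    """Minimum data center and DCI capacity: headroom times the summed expected load."""
    return headroom * sum(app.peak_mbps for app in apps)


if __name__ == "__main__":
    # Made-up application loads, for illustration only.
    apps = [AppLoad("orders", 200.0), AppLoad("search", 150.0), AppLoad("reports", 50.0)]
    print(required_dci_capacity(apps))
```

Keeping the multiplier as a named parameter makes it easy to raise the headroom if workflows turn out to cross the data center network more often than expected.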
Open the gateways
Gateways link the data center, DCI network and clouds. For each gateway connection, aim for capacity of at least double the expected workload.
Develop realistic expectations related to workloads. Examine the redeployment-under-failure and scaling plans for each component of the application, whether it is a monolith or a collection of microservices. When an application workload can move to a cloud or spin up new instances there, add its workflow traffic to the expected overall load for that cloud.
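One way to apply this is to count, for each cloud gateway, both the workloads that normally run in that cloud and those that could land there through failover or scaling. The sketch below does that and then applies the doubling rule for gateways. The `Workload` type, its field names and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Workload:
    name: str
    peak_mbps: float                  # expected workflow traffic in Mbps (assumed unit)
    home: str                         # where the workload normally runs
    can_move_to: Tuple[str, ...] = () # clouds it may fail over or scale into


def expected_gateway_load(workloads: Iterable[Workload], cloud: str) -> float:
    """Sum the traffic of every workload that runs in, or could move to, `cloud`."""
    return sum(
        w.peak_mbps for w in workloads
        if w.home == cloud or cloud in w.can_move_to
    )


def gateway_capacity(workloads: Iterable[Workload], cloud: str, headroom: float = 2.0) -> float:
    """Gateway capacity target: at least double the expected load, per the rule of thumb."""
    return headroom * expected_gateway_load(workloads, cloud)


if __name__ == "__main__":
    # Made-up inventory, for illustration only.
    inventory = [
        Workload("orders", 100.0, home="dc", can_move_to=("cloud_a",)),
        Workload("search", 40.0, home="cloud_a"),
        Workload("reports", 60.0, home="dc", can_move_to=("cloud_b",)),
    ]
    for cloud in ("cloud_a", "cloud_b"):
        print(cloud, gateway_capacity(inventory, cloud))
```

This deliberately plans for the worst case in which every movable workload lands in the same cloud at once; a less conservative plan would weight each move by how likely it is.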
Capacity planning for cloud connections becomes difficult when applications access the cloud provider's services over the internet, because cloud traffic and internet traffic collide in the internet gateway. Cloud and internet traffic associated with the same transaction can flow back and forth through the internet gateway, increasing delays. To be safe, provide capacity equal to three to four times the expected traffic.
The network holds the cloud together, which means that, when you have a network congestion problem, you have a cloud problem, an application problem and a user quality-of-experience problem. Even with careful network capacity planning, seemingly simple decisions on cloud deployment and scaling can radically change workflows, traffic on critical network pieces and congestion. Always monitor the network and applications to detect these conditions, and launch a new capacity plan when it looks like expectations are missing the mark.
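A monitoring check along these lines can be sketched as a comparison between observed traffic and the plan. The function below is a hypothetical illustration: it flags a link for re-planning when the observed peak exceeds the expected load the plan was built on, or when peak utilization passes a target such as 50%. The parameter names and thresholds are assumptions, and a real deployment would feed it samples from whatever monitoring system is in place.

```python
from typing import Sequence


def needs_replan(
    samples_mbps: Sequence[float],
    capacity_mbps: float,
    expected_mbps: float,
    target_utilization: float = 0.5,
) -> bool:
    """Return True when observed traffic suggests the capacity plan is off.

    Triggers if the observed peak exceeds the planned expected load, or if
    the peak pushes link utilization above the target.
    """
    if not samples_mbps:
        return False  # no observations, nothing to compare against
    peak = max(samples_mbps)
    return peak > expected_mbps or peak / capacity_mbps > target_utilization


if __name__ == "__main__":
    # Made-up samples for a 1,000 Mbps link planned around 250 Mbps of load.
    print(needs_replan([100.0, 180.0, 240.0], 1000.0, 250.0))
    print(needs_replan([100.0, 300.0], 1000.0, 250.0))
```

Checking against the peak rather than the average matches the concern about congestion during traffic peaks.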