Introduction

As technology advances, so does the need for better and more efficient ways to handle web traffic. One such method is load balancing, which distributes traffic across multiple servers. Google Cloud Platform (GCP) offers a load-balancing service with features aimed at businesses looking to improve their website’s performance and reliability.

Load balancing is an essential tool for managing web traffic. It ensures that your website remains available and responsive even during periods of high traffic. By distributing requests across multiple servers, load balancing helps to prevent bottlenecks and downtime. Additionally, load balancing can help optimize resource utilization, reducing costs and improving efficiency.

In this blog post, we’ll explore GCP Load Balancing in-depth, covering its types, benefits, and how to configure and use it.

What is Load Balancing?

Load balancing is the process of distributing network traffic across multiple servers to ensure that no single server is overwhelmed. It helps to improve website performance, reliability, and scalability by spreading the workload across multiple servers. Load balancing can be implemented at different levels, including network, transport, and application.

At the network level, load balancing can be used to distribute traffic across multiple data centres or geographic regions. At the transport level, load balancing can be used to balance traffic across multiple servers within a data centre. At the application level, load balancing can be used to distribute traffic across multiple instances of an application running on different servers.

Load balancing can be achieved through various methods, including round-robin, IP hash, and least connections. Each method has its advantages and disadvantages, and the choice of method depends on the application’s specific needs.
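To make the three methods concrete, here is a minimal Python sketch of each one. The class names, the backend names, and the use of SHA-256 for the IP hash are our own assumptions for illustration; real load balancers implement these ideas in their own ways.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hand out backends in a fixed rotation."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def pick(self, client_ip=None):
        return next(self._rotation)


class IpHash:
    """Map each client IP to the same backend every time."""

    def __init__(self, backends):
        self._backends = list(backends)

    def pick(self, client_ip):
        # SHA-256 is an arbitrary choice here; any stable hash would do.
        digest = hashlib.sha256(client_ip.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._backends)
        return self._backends[index]


class LeastConnections:
    """Send each new request to the backend with the fewest open connections."""

    def __init__(self, backends):
        self.open_connections = {b: 0 for b in backends}

    def pick(self, client_ip=None):
        backend = min(self.open_connections, key=self.open_connections.get)
        self.open_connections[backend] += 1
        return backend

    def release(self, backend):
        # Call this when a connection to the backend closes.
        self.open_connections[backend] -= 1
```

Round-robin is the simplest but ignores how busy each server is, IP hash keeps a client on the same server at the cost of uneven spread, and least connections adapts to load but needs the balancer to track connection counts.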

Why use GCP Load Balancing?

GCP Load Balancing offers several benefits that make it an attractive option for businesses looking to improve their website’s performance and reliability:

Scalability: GCP Load Balancing can scale to handle millions of requests per second, making it suitable for even the largest websites and applications.

Reliability: GCP Load Balancing is designed to be highly available and fault-tolerant, ensuring that your website remains accessible even during periods of high traffic or server failure.

Customizability: GCP Load Balancing offers a range of features and options that allow you to customize your load balancing configuration to meet the specific needs of your application.

Integration: GCP Load Balancing integrates seamlessly with other GCP services, such as Compute Engine, Kubernetes Engine, and Cloud CDN, making it easy to build scalable and reliable applications on the GCP platform.

Types of Load Balancers in GCP

GCP Load Balancing offers four types of load balancers, each designed for different use cases:

Network Load Balancing

Network Load Balancing is a regional, pass-through load balancer. It distributes traffic among backend instances within a single region rather than across regions, and it forwards packets to those backends without proxying the connection. Because there is no proxy hop in the path, it is a good fit for applications that require low latency and high availability, such as gaming or streaming services.

Network Load Balancing supports both TCP and UDP traffic and can handle millions of requests per second.

HTTP(S) Load Balancing

HTTP(S) Load Balancing is designed to distribute HTTP and HTTPS traffic across multiple instances of an application running on different servers. It uses advanced algorithms to route traffic to the backend that can handle the request most efficiently based on factors such as server health, available capacity, and proximity to the client.

HTTP(S) Load Balancing supports both IPv4 and IPv6 traffic and can handle millions of requests per second. It also offers advanced features such as SSL offloading, content-based routing, and session affinity.
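Content-based routing is easier to picture with a small example. The sketch below is a toy model in Python, not the GCP API: the `UrlMap` class, the service names, and the longest-prefix matching rule are all assumptions we have made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UrlMap:
    """Toy URL map: path-prefix rules plus a default backend service."""

    default_service: str
    path_rules: dict = field(default_factory=dict)  # path prefix -> service name

    def route(self, path: str) -> str:
        # Longest matching prefix wins, so "/video/hd" beats "/video".
        matches = [prefix for prefix in self.path_rules if path.startswith(prefix)]
        if not matches:
            return self.default_service
        return self.path_rules[max(matches, key=len)]
```

With rules for `/video` and `/video/hd`, a request for `/video/hd/clip.mp4` goes to the more specific service, while anything that matches no rule falls through to the default.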

Internal Load Balancing

Internal Load Balancing is designed to distribute traffic across multiple instances of an application running within a VPC network. It uses private IP addresses to route traffic to the backend, ensuring that traffic stays within the VPC network and does not traverse the public internet.

Internal Load Balancing is ideal for applications that require high-speed and secure communication between backend services, such as microservices architectures.

Global Load Balancing

Global Load Balancing is designed to distribute traffic across multiple regions or data centres from a single entry point. It offers features such as content-based routing, SSL offloading, and session affinity, making it suitable for more complex applications.

Global Load Balancing uses Google’s global anycast IP addresses to route traffic to the nearest available backend that can handle the request. This helps to minimize latency and improve performance.

How to Configure and Use GCP Load Balancing

Configuring and using GCP Load Balancing is straightforward and can be done using the GCP Console, CLI, or API. The steps involved include:

Create a target pool: A target pool is a group of backend instances that receive traffic from the load balancer. Target pools are used by Network Load Balancing; the HTTP(S), proxy, and internal load balancers group their backends into backend services instead.

Create a forwarding rule: A forwarding rule defines the IP address and port the load balancer listens on, and forwards matching traffic to the target pool.

Create a health check: A health check monitors the health of the backend instances and removes any instances that are not responding or are performing poorly.

Configure the load balancer: Finally, you configure the load balancer to use the target pool, forwarding rule, and health check to distribute traffic to the backend instances.
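The steps above can be sketched with the gcloud CLI. The Python helper below only builds the command lines so you can review them before running anything. The resource names, region, zone, and instance names are placeholders, and the helper itself (`network_lb_commands`) is hypothetical. The gcloud subcommands and flags are the ones we understand to apply to a target-pool-based load balancer, so check them against the current gcloud reference before use.

```python
import shlex


def network_lb_commands(name, region, zone, instances, port=80):
    """Return the gcloud invocations for a target-pool-based load balancer."""
    health_check = f"{name}-hc"
    pool = f"{name}-pool"
    rule = f"{name}-rule"
    return [
        # The health check comes first because the target pool references it.
        ["gcloud", "compute", "http-health-checks", "create", health_check],
        # Create the target pool and attach the health check.
        ["gcloud", "compute", "target-pools", "create", pool,
         "--region", region, "--http-health-check", health_check],
        # Add the backend instances to the pool.
        ["gcloud", "compute", "target-pools", "add-instances", pool,
         "--region", region, "--instances", ",".join(instances),
         "--instances-zone", zone],
        # The forwarding rule ties an IP address and port to the pool.
        ["gcloud", "compute", "forwarding-rules", "create", rule,
         "--region", region, "--ports", str(port), "--target-pool", pool],
    ]


def render(commands):
    """Quote each command so it can be pasted into a shell."""
    return [shlex.join(cmd) for cmd in commands]
```

Calling `render(network_lb_commands("demo", "us-central1", "us-central1-a", ["vm-1", "vm-2"]))` returns four shell-ready strings. Note that the health check is created before the target pool, even though the list above mentions it third.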

Conclusion

In conclusion, GCP Load Balancing is a powerful tool for managing web traffic and improving website performance and reliability. It offers a range of load balancing types, each designed for different use cases and with a range of features and options to meet the specific needs of your application. Whether you’re running a small website or a large-scale application, GCP Load Balancing can help you achieve your performance and reliability goals.

Load Balancing Fact Sheet

Types of Load Balancing

There are two types of load balancer: global and regional.

GLOBAL	REGIONAL
Global External Load Balancing	Regional External Load Balancing
HTTP(S) Load Balancing	Network Load Balancing
SSL Proxy Load Balancing	Regional Internal Load Balancing
TCP Proxy Load Balancing	Internal Load Balancing

The following sections explain each type of load balancer available on GCP.

Global External Load Balancer

HTTP Load Balancer

Global LB of HTTP traffic
Can configure URL rules
Traffic is routed to the closest instance group
Cross Region Load Balancer
Load is balanced by one of two methods
- Requests per second
- CPU utilization
Session Affinity
- Client IP affinity
- Cookie affinity
WebSocket support
- Default timeout of 30 seconds
- Timeout can be increased via the API
LB Interfaces
- gcloud CLI
- GCP Console
- The REST API
LB Timeouts and Retries
- Default response timeout: 30 seconds
- TCP session timeout: 10 minutes (600 seconds)
- Retries: GET requests are retried, POST requests are not
LB is logged by Stackdriver (since renamed Cloud Logging)
Server firewall must be configured if used
LB does not keep instances in sync
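To illustrate how routing to the closest instance group combines with a requests-per-second limit, here is a small Python sketch. It is a simplified model under our own assumptions: the `InstanceGroup` class, the region names, and the idea that the caller supplies groups sorted nearest-first are all hypothetical, and the real load balancer's capacity logic is more involved.

```python
from dataclasses import dataclass


@dataclass
class InstanceGroup:
    region: str
    max_rps: int          # capacity under the requests-per-second method
    current_rps: int = 0
    healthy: bool = True

    def has_capacity(self) -> bool:
        return self.healthy and self.current_rps < self.max_rps


def pick_group(groups_by_distance):
    """Return the nearest group with spare capacity.

    `groups_by_distance` is assumed to be sorted nearest-first for the client.
    """
    for group in groups_by_distance:
        if group.has_capacity():
            group.current_rps += 1
            return group
    return None  # every group is full or unhealthy
```

Requests stay in the nearest region until it reaches its limit or becomes unhealthy, then spill over to the next region in the list, which is the behaviour that makes this a cross-region load balancer.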


Typical HTTP Load balancer setup

Figure 2 – https://cloud.google.com/load-balancing/docs/https/

Illegal request handling

To enforce HTTP/1.1 compliance, the load balancer blocks a request when any of the following is true:

The first line of the request cannot be parsed.
A header is missing the : delimiter.
Headers or the first line contain invalid characters.
The content length is not a valid number, or there are multiple content length headers.
There are multiple transfer encoding keys, or there are unrecognized transfer encoding values.
There’s a non-chunked body and no content length specified.
Body chunks are unparseable. This is the only case where some data will make it to the backend. The load balancer will close the connections to client and backend when it receives an unparseable chunk.

The load balancer also blocks the request if any of the following are true:

The combination of request URL and headers is longer than about 15KB.
The request method does not allow a body, but the request has one.
The request contains an upgrade header.
The HTTP version is unknown.
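A few of these checks can be expressed as a short validation function. The Python sketch below covers only a subset of the rules listed above, and it is our own illustration rather than the load balancer's actual parser. The function name, the set of known versions, and measuring the whole request head against the 15KB limit are assumptions.

```python
MAX_URL_AND_HEADER_BYTES = 15 * 1024       # "about 15KB" in the list above
KNOWN_VERSIONS = {"HTTP/1.0", "HTTP/1.1"}  # assumption for this sketch


def rejection_reason(raw_head: str):
    """Return why a request head would be blocked, or None if it passes.

    `raw_head` is the request line plus headers, separated by CRLF.
    """
    # Approximation: the whole head is measured, not just URL plus headers.
    if len(raw_head.encode()) > MAX_URL_AND_HEADER_BYTES:
        return "URL and headers too long"

    lines = raw_head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        return "cannot parse first line"
    if parts[2] not in KNOWN_VERSIONS:
        return "unknown HTTP version"

    content_lengths = []
    for line in lines[1:]:
        if not line:
            continue  # skip the blank line that ends the headers
        name, sep, value = line.partition(":")
        if not sep:
            return "header missing ':' delimiter"
        if name.strip().lower() == "content-length":
            content_lengths.append(value.strip())

    if len(content_lengths) > 1:
        return "multiple Content-Length headers"
    if content_lengths:
        value = content_lengths[0]
        if not (value.isascii() and value.isdigit()):
            return "invalid Content-Length"
    return None
```

For example, `rejection_reason("GET / HTTP/1.1\r\nHost example.com")` reports the missing `:` delimiter, while a well-formed head returns `None`.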

SSL Load Balancer (SSL Proxy)

SSL (TLS) connections are terminated at the load balancer layer; the SSL proxy then balances the connections across all instances.

Figure 3- https://cloud.google.com/load-balancing/docs/ssl/

Benefits
- Intelligent routing
- Better use of instances
- Certificate management
- Security patching
- Supports ports 25, 43, 110, 143, 195, 443, 465, 587, 700, 993, 995
Components
- Health checking
- Backend services
- SSL cert and key
- Global forwarding rules

TCP Load Balancer

Same properties as the SSL proxy LB

Regional Load Balancer

Internal TCP/UDP Load Balancer

Internal LB scales services behind a private LB IP that is accessible only to instances on the VPC
Lower latency (traffic stays within the GCP network)
Supports auto mode VPC, custom mode VPC, and legacy networks
Can be implemented with regional managed instance groups (enables autoscaling across the zones of a region)

Figure 4 – https://cloud.google.com/load-balancing/images/ilb-high-level.svg

Figure 5- https://cloud.google.com/load-balancing/images/ilb-3-tier-web-app.svg

LB Selection Algorithm
- By default, the internal LB uses a 5-tuple hash
  - Client source IP
  - Client port
  - Destination IP (the LB IP)
  - Destination port
  - Protocol (either TCP or UDP)
- If you want to control how traffic is spread across backends, use one of the following options
  - 3-tuple hash (client IP, destination IP, protocol)
  - 2-tuple hash (client IP, destination IP)
Restrictions
- Internal to GCP only
- Cannot send traffic to VPN tunnel
- 50 rules max
- 250 forwarding rules max
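The selection algorithm described above can be sketched in a few lines of Python. The field names, the `pick_backend` helper, and the use of SHA-256 with a modulo are our assumptions for illustration; the hash function the internal load balancer really uses is not something this fact sheet covers.

```python
import hashlib

# Which packet fields feed the hash in each mode.
AFFINITY_FIELDS = {
    "5-tuple": ("src_ip", "src_port", "dst_ip", "dst_port", "protocol"),
    "3-tuple": ("src_ip", "dst_ip", "protocol"),
    "2-tuple": ("src_ip", "dst_ip"),
}


def pick_backend(packet: dict, backends: list, mode: str = "5-tuple") -> str:
    """Hash the fields selected by `mode` and map the result to a backend."""
    key = "|".join(str(packet[name]) for name in AFFINITY_FIELDS[mode])
    digest = hashlib.sha256(key.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

Because the 2-tuple and 3-tuple modes leave the client port out of the hash, every connection from one client to the LB IP lands on the same backend. With the default 5-tuple hash, a new client port can send the next connection to a different backend.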

External (Regional) Load Balancer

Figure 6- https://cloud.google.com/load-balancing/images/lb-cross-region-1-ipv6.svg

Balances load based on incoming IP protocol data: address, port, and protocol
Routes traffic to multiple backend instances
Consideration
- Load Distribution Algorithm
- Target Pools
- Session Affinity
- Health Checking
- Firewall rules and Network load balancing
Connection Draining
- Can be triggered manually or by the autoscaler
- A timeout duration must be set (1-3600 seconds)
- Existing user sessions terminate gracefully while new sessions are re-routed
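Connection draining is easier to reason about as a tiny state machine. The Python sketch below is a toy model under our own assumptions: the `Backend` class and its method names are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional

MIN_TIMEOUT, MAX_TIMEOUT = 1, 3600  # seconds, the range given above


@dataclass
class Backend:
    name: str
    draining_timeout: int                     # seconds
    drain_started_at: Optional[float] = None  # None means not draining

    def __post_init__(self):
        if not MIN_TIMEOUT <= self.draining_timeout <= MAX_TIMEOUT:
            raise ValueError("draining timeout must be 1-3600 seconds")

    def start_draining(self, now: float) -> None:
        self.drain_started_at = now

    def accepts_new_sessions(self) -> bool:
        # Once draining starts, new sessions are re-routed elsewhere.
        return self.drain_started_at is None

    def keeps_existing_sessions(self, now: float) -> bool:
        # Existing sessions continue until the timeout expires.
        if self.drain_started_at is None:
            return True
        return now - self.drain_started_at < self.draining_timeout
```

A backend with a 300-second timeout that starts draining at t=100 stops taking new sessions immediately, but keeps serving existing ones until t=400.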
