|
One minute of system downtime can cost an organization anywhere from
$2,500 to $10,000 per minute. Using that metric, even 99.9 data availability
can cost a company $5 million a year.
The Standish Group
Internet commerce demands 24x7 Web site availability. We need systems that
provide zero planned downtime, so during routine maintenance, system
upgrades, etc., the system can still serve client requests. We also need
systems with zero unplanned downtime - one application server crashing or
one computer accidentally turned off should not stop the system as a whole
from working. We need to be able to handle far more concurrent requests than
one server can handle. To do all this we need clustering.
These objectives cannot be met by strict adherence to J2EE APIs. It's
not something the application does, but something the container or
infrastructure provides.
A cluster is a set of server nodes that cooperates to provide a more
scalable and fault-tolerant server infrastructure for stateful and stateless
components. To external clients, a cluster appears as a single server that
services requests with a single point of entry.
In this article we'll discuss clustering with regard to the J2EE
platform, focusing on the Web tier. We'll discuss the various tiers of
clustering, the mechanisms, and their performance/scalability ramifications,
and illustrate how to design highly available, scalable, fault-tolerant, and
performant systems.
Tiers of J2EE Clustering
Clustering is not limited to any single tier. It can be achieved in
every tier - cache, Web (servlet, JSP), EJB, and even at the database (see
Figure 1).
Caches sit in front of Web servers caching Web content and acting as
virtual servers. They reduce the response time, offload the back-end
servers, and help the Web tier handle the huge volume of client requests,
thus increasing scalability.
The Web tier sits behind the cache layer. In a typical J2EE solution,
the Web tier is made up of servlets and JSPs that are responsible for
dynamic content generation.
Beyond the Web server tier is the EJB tier that runs stateless session
beans, stateful session beans, and entity beans.
Even the last layer of the database can now be clustered using a
database cluster, as with Oracle9i Real Application Clusters.
Aspects/Facets of J2EE Clustering
The two major aspects of clustering are:
- Load balancing or front ending
- Fault tolerance or reliability
Load Balancing or Front Ending
The idea behind load balancing is to distribute the load (from client
requests) to multiple back ends. This enables a cluster (collection of
cooperating servers) to handle more requests/clients than an individual
server, thus providing scalability. A load balancer sits in front of the
server nodes and receives requests, then redirects these client requests to
the various back ends.
This distribution of the workload boosts performance (shorter response
time).
Individual servers in the cluster can go offline for maintenance
without causing the system to halt or fail (zero planned downtime).
This provides scalability because the cluster can handle more clients
than any individual server.
Group Membership
Server nodes need to be registered with the load balancer to have
requests routed to them from the load balancer (see Figure 2). These
registrations can happen statically or dynamically.
In static registration the load balancer is configured statically
beforehand with information about its target list of servers to register. To
change this list, the load balancer must be restarted.
In dynamic registration, server nodes register dynamically at runtime
with their target load balancer. This enables you to add new servers to the
mix dynamically without bringing down the load balancer.
Usually the dynamic registration is done using bidirectional
notification at startup.
Server nodes notify the load balancer at startup and register with the
target front end. The load balancer in turn lets all potential servers know
when it comes up so that server nodes seeking to register with it can do so.
The load balancer is the single point of entry to a cluster and this becomes
a single point of failure.
To add reliability to the load balancer and ensure availability, a
process monitor daemon can ping the load balancer; if the load balancer does
not respond, the daemon can restart it. The load balancer can be behind a
hardware load balancer that's using a virtual IP. To prevent a restart from
interrupting the request routing, it's important for the load balancer to be
stateless. In other words, all routing information for the session is
transferred as a cookie with each request.
Load Balancing Strategies
Random: As the name indicates, the requests get dispatched to back ends in a random manner.
Round-robin: Refers to a distribution on a round-robin basis. This can result in a bad situation: if we have two servers and our request pattern is a big heavy workload request followed by a light workload request. In a
round-robin situation, all heavy requests go to one back end and all
lightweight requests go to another.
Weighted: Any of the algorithms can have weights associated with them.
For example, I want twice as many requests dispatched to server A as to
server B. This implies a weight of 2/3 : 1/3 for A:B.
Equal request: Load balancer maintains a counter for each node ensuring all nodes receive an equal number of requests (or weighted).
Equal client: Requests get dispatched based on the client requesting it. This proves useful for stateful sessions. The load balancer sticks a session to a server node, which means all requests from a given session will
be handled by the same server node (in failover cases this will hold until
the server fails).
Equal workload: The load balancer keeps track of the workload handled by each server node by polling them at regular intervals, and weights can be adjusted for distribution. So workload statistics, if fed to an adaptive
weighted system on a load balancer, can load balance an equal workload.
Even though this may seem the ideal way of load balancing, the overhead
of calculating the workload results in performance hits that may make it
undesirable.
Failover and Reliability
When an application server in a cluster fails to serve client requests,
the load balancer reroutes the requests to its peer(s). This is termed as
failover and provides the basis for fault tolerance and reliability in the
cluster. Each server node of a cluster names one of its peer as its
secondary server node. If a server node fails, the load balancer finds its
secondary/secondaries and reroutes the requests.
The advantages of failover are:
- User is immune to system crashes
- Reliability
The disadvantages of failover are:
- Redundancy
- Performance hit due to replication
Replication
Failover is achieved through replication. Replication is done using
either:
- Point-to-point using direct socket communication
- Multicast the message to the entire group
Stateless session failover is tantamount to a simple request
redirection. Stateful session failover on the other hand requires both
request redirection and session replication.
Replication is a two-step process of transmission and consumption.
Transmission from the sending server VM occurs according to some strategy
while message consumption on the receiving VM occurs lazily. The bytestream
received is not deserialized until needed. Because it's not materialized
into the HttpSession, stateful sessions tend to be sticky. We see that the
entire system is an n serialize (serialization happens many times on the
transmitting side), or a 0 or 1 deserialize (message is consumed lazily only
if needed, i.e., in case of a failover) system. Stickiness of sessions is
done using the jsession_id cookie.
Replication Methods
Replication can happen over different transports.
Point-to-point transport replication is typically done for a single
primary and a single secondary. Here you typically use a TCP socket
connection between the primary and a secondary server. In case of a
failover, the client can actually failover to any node in the cluster.
Thereafter the node receiving the request can request the session from the
cluster and then the secondary will send the session and continue to serve
as the secondary for the new primary.
An alternative to point-to-point replication is to replicate using UDP
multicast. However, the primary concern with it is the reliability of the
message delivery. TCP packets have an acknowledgment built into the system
but, on top of UDP, application server vendors often build a NAK (negative
acknowledgment). Each packet can be sequenced with an ID, and if the
receiving VM finds that it received packets 1, 2, 3, and 5, then it knows it
missed packet 4. UDP-based systems can have multiple secondaries or an
island (as in Oracle9i Application Server) unlike point-to-point systems.
Since UDP-based systems are not waiting for acknowledgment, the replication
can happen asynchronously and so such a system can be more performant.
In this system, state replication happens within an island and load
balancing happens within and across islands. However, once a stateful
session is bound to an island, it should continue to work against the
island. A stateful session once bound to a back end sets cookies that
identify the back end as well as the island to which it was bound.
Web Application Failover
Web application failover requires that:
The application is marked distributable.
Objects in the HttpSession are serializable or remote.
Instance/static variables should not store state. Such variables do not
make sense when we failover to a different Java Virtual Machine (server
node).
ServletContext should not store state.
EJBs/database should store long-lived application state.
public void doGet (...) ... {
HttpSession session = request.getSession(true);
Cart cart = (Cart)session.getAttribute("cart");
if (cart == null) {
cart = new Cart();
session.setAttribute("mycart", cart);
} ....
In the Web clustering replication most application server vendors
currently replicate when setAttribute/removeAttribute is called on HttpSession.
This implies that if we do a setAttribute for cart, then update it and
don't setAttribute again, the secondary/secondaries won't have the updated
state.
There are times when you may want to use load balancing without
failover:
In pure performance when we're not prepared to take the hit of
replication
When we have nonserializable objects in HttpSession
When the requests are stateless
Common Pitfalls
In general, failover is transparent to the end user except in some cases
that I'll mention later.
Static variables in a servlet should not be used, but in most servlet
containers users can get away with using static variables. However, in
distributable applications, static variables will be reset in the
new/failed-over VM and will therefore be different from the values in the
original VM.
Most people assume that the init method of a servlet is executed only
once. This is true in one VM; however, in a failover scenario the init
method of a servlet executes once per failover (once on each VM).
Remember that in case of a failover, we go back to the last state in
which the primary did a setAttribute for each attribute. If after certain
updates a setAttribute was not done, we won't find those updates on the
secondary/secondaries.
Performance Optimizations
The price of plurality lies in replication and its related network
overhead.
The system is not n serialize - n deserialize. Almost all clustering solutions use n serialize 1 deserialize, so anything to speed up
serialization helps.
In the course of serialization a serialVersionUID computes the version
of an object and this can be a time-consuming operation that's being
undertaken for every serializationSerialVersionUID. This is a private static long defined in a class.
Preferably, provide a readObject/writeObject for the class being
serialized. This can save on time spent in reflection to get the
attributes/state variables to serialize. When writing a point object the
writeObject can just write the values of x and y; for the default
serialization mechanisms to serialize an object of type point, it must
reflect on the class Point and find out that x and y are the attributes to serialize followed by reflecting to obtain the values of x and y.
If a network is the constraining factor, it may be useful to find the
difference between previously serialized streams and transmit it. However,
this typically has the overhead of computing the delta or difference between
the streams.
Recommendations
Multisect the problem space so that the secondary/secondaries do not
have common causes of failure.
Use as much redundancy as possible.
Eliminate common causes of failure, such as a common source of power.
For a session, preferably pick a secondary so that both primary and
secondary are not on the same machine.
Summary
In this article, we looked into clustering, focusing on the Web tier. We
also discussed the two major aspects of clustering: load balancing and
failover. To provide high availability, load balancing, scalability, and
reliability, it's important to choose the correct application server
infrastructure and configure it appropriately. An insight into the
underlying mechanisms can prove invaluable. Application servers, such as BEA
WebLogic and Oracle9i, provide clustering. In subsequent articles we'll
discuss clustering in the EJB tier.
|