Tuesday 31 May 2011
When an infrastructure must support performance-critical applications, special performance-related design decisions have to be made. Here are some tips:
- In general, it must be known what the system will be used for. A large data warehouse needs a different infrastructure design than an online transaction processing system or a web application. Batch-oriented systems have different performance characteristics, and need different infrastructure solutions, than interactive systems or systems that must support high peak loads.
- In some cases special products must be used for certain systems. Real-time operating systems, in-memory databases, or even specially designed file systems can be a solution for extremely performance-sensitive systems.
- Most vendors of databases, web servers, operating systems, and storage or network solutions provide architects with standard implementation plans that are proven in practice. In general, try to follow the vendor's recommended implementation. It is also always a good idea to have the vendors check the design you created: not only can they approve your design, they can also suggest improvements that you might not have considered. I have had good experiences with having vendors check my designs!
- When possible, try to spread the load of the system in time. It is probably not such a great idea to have a complex batch job running at 09:00 AM, when all people get to work and start their PCs. Likewise, make certain that a backup job is not scheduled while a critical report is being compiled. Sophisticated scheduling applications exist to help you manage such complex dependencies.
- Implement some form of Quality of Service (QoS) for important jobs. On the network layer, for instance, QoS can provide fast response times for interactive systems, while sending email can be done at a lower priority (especially when the email contains large attachments). Most operating systems can also be configured to handle certain processes with more priority than other processes (see the first sketch after this list).
- For availability reasons, many systems have off-line copies of data available (for more information on this subject, see the chapter on availability). If possible, I/O-intensive operations like running complex reports or aggregating data in data warehouses should be set to run on these off-line systems instead of on the production systems.
- Sometimes performance can be increased by moving rarely used data from the main systems to other systems. Large databases tend to be slower than small ones, so moving old data to a separate historical database can speed up the production database, which is now smaller in size (see the second sketch after this list).
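To illustrate the process priority tip, here is a minimal sketch in Python, assuming a Unix-like operating system; the nice value and the job itself are arbitrary examples, not a vendor-specific QoS configuration:

    import os

    # A bulk job can lower its own CPU scheduling priority so that
    # interactive processes are scheduled first. Higher nice values mean
    # lower priority; the typical range is -20 to 19.
    def run_bulk_job():
        os.nice(10)  # lower this process's priority by 10
        # ... do the I/O or CPU intensive work here ...
        print("running at niceness", os.nice(0))  # os.nice(0) reads the current value

    if __name__ == "__main__":
        run_bulk_job()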
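And a sketch of the data archiving tip, using SQLite purely as a stand-in for a production database; the table and column names are made up for the example:

    import sqlite3

    # Move rows older than a cutoff date from the production table to a
    # historical table, so the production table stays small and fast.
    def archive_old_orders(conn, cutoff):
        cur = conn.cursor()
        cur.execute("INSERT INTO orders_history "
                    "SELECT * FROM orders WHERE order_date < ?", (cutoff,))
        cur.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))
        conn.commit()

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, order_date TEXT)")
    conn.execute("CREATE TABLE orders_history (id INTEGER, order_date TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, "2009-01-15"), (2, "2011-04-01")])
    archive_old_orders(conn, "2010-01-01")
    print(conn.execute("SELECT COUNT(*) FROM orders").fetchone())  # (1,)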
Tuesday 17 May 2011
To make optimal use of a horizontally scaled system, some form of load balancing is usually applied to spread the load over the various machines.
Load balancing uses multiple servers in a system to perform identical tasks; such a group is also known as a server farm. Examples are a web server farm, a mail server farm, or an FTP server farm. A load balancer automatically redirects incoming requests to members of the server farm: it checks the current load on each server in the farm and sends each incoming request to the least busy server.
A load balancer also increases availability: when a server in the server farm is unavailable, the load balancer notices this and ensures no requests are sent to it until it is back online. Of course, the availability of the load balancer itself becomes very important in this setup.
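Here is a minimal sketch in Python of the behaviour described above (least-busy selection plus skipping unavailable servers); the server names and the way load and health are measured are assumptions for the example, not how any particular load balancer product works:

    # Each server tracks its number of active requests; servers marked
    # unavailable are skipped until they are back online.
    class Server:
        def __init__(self, name):
            self.name = name
            self.active_requests = 0
            self.healthy = True

    class LoadBalancer:
        def __init__(self, servers):
            self.servers = servers

        def pick(self):
            candidates = [s for s in self.servers if s.healthy]
            if not candidates:
                raise RuntimeError("no healthy servers in the farm")
            # "least busy" here means: fewest active requests right now
            return min(candidates, key=lambda s: s.active_requests)

    farm = [Server("web1"), Server("web2"), Server("web3")]
    lb = LoadBalancer(farm)
    farm[1].healthy = False  # web2 is down and will be skipped
    target = lb.pick()
    target.active_requests += 1
    print("request sent to", target.name)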
It is also important to realise that server load balancing introduces new challenges. The servers must be 100% identical to each other in terms of functionality. For instance, each web server in a load balancing setup must have access to the same information. Furthermore, the application running on a load balanced system must be able to cope with the fact that each request can be handled by a different server; in other words, the application must be stateless for this to work.
A typical example is a web application asking the user for a username and password. When the request is sent from web server number one, and the reply (the filled-in form) is sent to web server number two by the load balancer, the web application must be able to handle this. If this is not the case, the load balancer must be made more intelligent, so it can keep track of the state of the application. Of course, if a server in the server farm goes down, its per-session information becomes inaccessible and any sessions depending on it are lost.
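One common way to deal with this is to keep the session state outside the web servers altogether, so the application itself stays stateless. A minimal sketch, where a plain dictionary stands in for a real shared store (a database, memcached, and the like):

    # Session state lives in a shared store instead of in a web server's
    # own memory, so any server in the farm can handle any request.
    SHARED_SESSIONS = {}  # stand-in for an external shared store

    def handle_login(server_name, session_id, username):
        # web server one receives the filled-in form and stores the
        # session state centrally
        SHARED_SESSIONS[session_id] = {"user": username}
        print(server_name, "stored session for", username)

    def handle_next_request(server_name, session_id):
        # a different web server picks the session up from the shared store
        session = SHARED_SESSIONS.get(session_id)
        print(server_name, "sees user:", session["user"] if session else None)

    handle_login("web1", "abc123", "alice")
    handle_next_request("web2", "abc123")  # web2 still knows who alice is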
In the network realm, load balancing is done to spread network load over multiple network connections.
For instance, most network switches support port trunking. In such a configuration, multiple Ethernet connections are combined into one virtual Ethernet connection that provides higher throughput. For example, a network switch can trunk three 100Mb/s Ethernet connections into one (virtual) 300Mb/s connection. The load is then balanced over the three lines by the network switch.
In storage systems, multiple connections are also common, not only to increase the bandwidth of the connections, but also to increase availability.
Tuesday 03 May 2011
Scalability indicates the ease with which a system or component can be modified, added, or removed to accommodate a changing load. A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.
In general there are two ways of increasing the scalability of a system: vertical scaling and horizontal scaling.
Vertical scaling (also known as scaling up) means adding resources to a single node in a system, typically by adding CPUs or memory to a single computer (“To make the system faster, just add more memory”). The problem with vertical scaling is that there is always a limit to how far a single system can scale: a system board only supports so much memory. And adding more resources to a single system can quickly get very expensive.
An alternative to vertical scaling is horizontal scaling. Horizontal scaling (also known as scaling out) means adding more nodes to a system, such as adding a new web server to a pool of web servers or more disks to a disk array. These days, low-cost "commodity" systems can be combined to perform tasks that in the past could only be handled by supercomputers.
In general, a larger number of computers means increased management complexity, as well as a more complex programming model and issues such as throughput and latency between nodes. But while horizontal scaling is more complex to implement than vertical scaling, it pays off in the long term, as the achievable scalability is much higher.