Do you know how redundancy provides fault tolerance? You should if you plan to take the Security+ exam. Redundancy adds duplication to critical system components and networks. This post should help.
Redundancy into Systems and Networks
One of the constants with computers, subsystems, and networks is that they will fail. It’s one of the few things you can count on. It’s not a matter of if they will fail, but when. However, by adding redundancy into your systems and networks, you can increase the reliability of your systems even when they fail. By increasing reliability, you increase one of the core security goals: availability.
Redundancy adds duplication to critical system components and networks and provides fault tolerance. If a critical component has a fault, the duplication provided by the redundancy allows the service to continue as if a fault never occurred. In other words, a system with fault tolerance can suffer a fault, but it can tolerate it and continue to operate. Organizations often add redundancies to eliminate single points of failure.
You can add redundancies at multiple levels:
- Disk redundancies using RAID
- Server redundancies by adding failover clusters
- Power redundancies by adding generators or a UPS
- Site redundancies by adding hot, cold, or warm sites
Redundancy to Eliminate Single Point of Failure
A single point of failure is a component within a system that can cause the entire system to fail if the component fails. When designing redundancies, an organization will examine different components to determine if they are a single point of failure. If so, they take steps to provide a redundancy or fault-tolerance capability. The goal is to increase reliability and availability of the systems.
Some examples of single points of failure include:
- If a server uses a single drive, the system will crash if the single drive fails. Redundant array of inexpensive disks (RAID) provides fault tolerance for hard drives and is a relatively inexpensive method of adding fault tolerance to a system.
- If a server provides a critical service and its failure halts the service, it is a single point of failure. Failover clusters provide fault tolerance for critical servers.
- If an organization only has one source of power for critical systems, the power is a single point of failure. However, elements such as uninterruptible power supplies (UPSs) and power generators provide fault tolerance for power outages.
Although information technology (IT) personnel recognize the risks with single points of failure, they often overlook them until a disaster occurs. However, tools such as business continuity plans help an organization identify critical services and address single points of failure.
Remember this
A single point of failure is any component whose failure results in the failure of an entire system. Elements such as RAID, failover clustering, UPSs, and generators remove many single points of failure. RAID is an inexpensive method used to add fault tolerance and increase availability.
You might also like to view these posts:
Security+ Practice Test Questions
RAID and Security+ (RAID-0, 1, 5)