How Do You Recover From a Critical System Failure?
Backing up your data has always been an essential task when working in IT. However, as storage devices have grown in capacity and fallen in cost, the amount of data held by any one person or organisation has grown dramatically. Almost everyone has at some point dealt with a failed piece of technology and lost data as a result.
What disaster recovery procedures do you have in place should a major system failure occur?
The 3-2-1 Method
While there are many different methods and strategies for backing up your data, one long-standing and widely practised approach is the 3-2-1 Method.
This method calls for:
Three Copies of Your Data
Three copies of your data give you redundancy if one copy becomes corrupted.
Two Different Mediums (Devices)
Two different mediums allow you to recover from another device if one device suffers a critical failure.
One Offsite Location
An offsite location protects against environmental and physical threats. A backup on a disk in your office will not protect against a fire, or against a thief stealing all of your IT assets.
As an example of this in practice, family photos on a computer are irreplaceable data that cannot be re-created after a data loss event. Three copies of the data can be achieved by:
- Having the live/production data (the photo itself)
- Backing up using an external hard drive (Windows Backup / Apple Time Machine)
- Syncing the data to an external cloud system (OneDrive / iCloud)
The two different mediums are achieved by backing up the photos to both an external disk and the cloud location.
Lastly, the offsite location is achieved by syncing the data to the external cloud location.
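The three conditions can also be written down as a quick check. The following is a minimal sketch in Python, not a feature of any backup product: the `Copy` class, the `check_3_2_1` function, and the medium labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str      # human-readable label for this copy
    medium: str    # kind of storage, e.g. "internal-disk" (labels are assumptions)
    offsite: bool  # True if the copy lives away from the primary site


def check_3_2_1(copies: list[Copy]) -> dict[str, bool]:
    """Report which of the three 3-2-1 conditions a set of copies meets."""
    return {
        "three_copies": len(copies) >= 3,
        "two_mediums": len({c.medium for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


# The family photo example: live data, an external disk, and a cloud sync.
photos = [
    Copy("live photo library", "internal-disk", offsite=False),
    Copy("external drive backup", "external-disk", offsite=False),
    Copy("cloud sync", "cloud", offsite=True),
]

for rule, satisfied in check_3_2_1(photos).items():
    print(f"{rule}: {'ok' if satisfied else 'NOT met'}")
```

Removing the cloud copy from the list makes both the copy count and the offsite condition fail, which shows how much of the example depends on that single copy.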
Planning for Disaster
Does Your Generator Have Fuel?
Like a backup generator with no fuel, your backups need regular maintenance and testing to confirm that they will work when required.
This can be as simple as setting scheduled times to restore test files on the system.
In the case of system snapshots, this means restoring full server images to isolated virtual machines to validate that they work as expected.
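A file-level restore test can be partly automated by comparing checksums of the original files against the restored copies. The sketch below is one possible approach using only the Python standard library. The function names `sha256_of` and `verify_restore` are hypothetical, and it assumes the source files have not changed since the backup was taken, so it suits a dedicated set of test files better than live data.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Return the relative paths that are missing or differ in the restored copy."""
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue  # directories are covered implicitly by the files inside them
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(f"changed: {relative.as_posix()}")
    return problems
```

An empty list means every test file came back byte-for-byte identical. Anything else is worth investigating before a real failure forces the issue.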
Now that you have a running and tested backup, you can be reasonably confident of recovering from a worst-case scenario.
But as any high-stress or disaster event shows, it is easy to panic in the heat of the moment and lose focus on your goals.
To ensure a streamlined and swift process, it is worthwhile having a disaster recovery plan. This document formally outlines the specific instructions to follow in a disaster recovery event.
The details that should be explicitly outlined include:
Who Is Affected and Have They Been Notified?
Communication will be vital in setting expectations on your system’s status in a critical event. Identify who has been affected by the event and acknowledge the affected systems.
Recovery Time Objective (RTO)
RTO defines the target time for returning services to normal after a failure. This value can vary widely depending on the required level of recovery. Once it has been defined, it should be communicated to all affected parties, and as above, ongoing communication will be vital in keeping them up to date.
Recovery Point Objective (RPO)
RPO is the acceptable level of data loss in a disaster recovery event, measured as the time between the last usable backup and the failure. This value should directly relate to how often your backups run, e.g. hourly, daily, or even weekly, depending on the system.
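The link between backup frequency and RPO is simple arithmetic: in the worst case, a failure strikes just before the next backup runs, so roughly one full backup interval of data is lost. The short Python sketch below illustrates this. The function names and the example figures are assumptions for illustration, and the time taken by the backup job itself is ignored.

```python
from datetime import datetime, timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss is about one full interval between backups."""
    return backup_interval <= rpo


def data_loss_window(last_backup: datetime, failure_time: datetime) -> timedelta:
    """Everything written after the last good backup is lost."""
    return failure_time - last_backup


# Hypothetical target: lose no more than four hours of work.
rpo = timedelta(hours=4)
print(meets_rpo(timedelta(hours=1), rpo))  # hourly backups
print(meets_rpo(timedelta(days=1), rpo))   # daily backups

# A nightly backup at 02:00 followed by a failure at 14:30 the same day.
lost = data_loss_window(datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 14, 30))
print(lost)
```

Under these assumptions an hourly schedule meets the four-hour target while a daily one does not, and the nightly example loses twelve and a half hours of changes.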
Backup as a Service (BaaS) from Ducentis
Ducentis has invested considerable time in developing a BaaS solution.
We offer both fully hosted solutions, where critical client services and data are stored and protected in our cloud-hosted environment, and hybrid solutions, which provide local backups replicated to these same cloud services.
Author: Daniel Ring – Senior MIS Technician