Backups that really work

  1. Why should I backup?
  2. What storage devices and media should I use for backup purposes?
  3. What files should I backup when?
  4. What backup software should I use?
  5. How can I be sure my backups really work?
  6. Implementation of the "perfect backup system"

Why should I backup?

Conclusion: you can't store data safely on a computer system without making backups.

If you value your data, you have to make backups. That is generally understood, but rarely accomplished. You can lose data through human error (like deleting the wrong files, very common!), software failure, hardware failure, and viruses.

Much of the hesitance about going paperless has to do with data safety issues. Theoretically, digital data are much safer than paper data - you can easily make multiple identical copies and store them at different locations - if only you would.

Theoretically, you would need backups of your paper data as well. Coffee is easily spilled over a page of paramount importance, a fire in your archive rooms can destroy the evidence of decades of general practice, and nobody is protected against the most common cause of data loss - misfiled paper documents, which may never be found again. Yet nobody backs up paper data, because it would be too expensive in many respects, and the copies would never be identical anyway, just close to the original. Digital data, by contrast, are cheap and simple to back up, and the copies are identical.

After reading this page, you will hopefully back up correctly and frequently, because you will know how to do it.

What storage devices and media should I use for backup purposes?

Conclusion: For our purposes in General Practice, CD-writers / re-writers seem to be the most suitable.

Typical backup storage devices include ZIP disks, JAZ disks, tapes, mini discs, CD-ROMs, CD-RWs, DVDs, hard disks and others. A decade ago, floppy disks were used for that purpose, and some people still use them. I hope you don't, because floppies are the most unreliable media available. New devices such as ORB drives are tempting because both the device and the media are cheap, but their future is still unclear. They might go the same way as the once promising optical disks. In the future, backing up off-site via the Internet will become more and more common, but at present there are no fast and reliable Internet connections available outside the capital cities (anything less than an ISDN connection is not an option).

The choice of backup media will depend on the amount of data to be stored, the price of the backup media, and their reliability. What people often forget to include in this list is the standardization of the media: a backup on a device or medium that needs unsupported drivers, or that depends on a specific version of an operating system or on specific backup software, is USELESS.

The optimal backup medium is reliable, cheap, standardized, widely available, likely to still be in use in the medium-term future, not dependent on specific environmental conditions (temperature, humidity), and portable.

ZIP and JAZ disks are not cheap, and you would depend on a single manufacturer with an unclear future - therefore I would not use them. ZIP disks have too small a capacity for most of our purposes anyway.

Tape drives too often depend on specific backup software, and the media are very sensitive to environmental conditions, including magnetic fields. Because of their large capacity, there is sometimes no way around tape drives. Their access time is usually slow.

CD-ROMs, CD-RWs, and DVDs are ideal backup media: reliable, sturdy, cheap, generally available (DVD not yet, but soon), standardized (does not apply to early versions of DVDs), and independent of specific operating systems and backup software. Their access time is acceptable.

Hard disks are good supplementary media for continuous backup and short-term data storage. Their access time is unrivaled.

Device | Price (device / media) | Availability of media (today / future) | Standardization (data format) | Capacity of media | Stability of media to environmental conditions (dust, temperature, humidity, magnetic fields) | Ease of use | Access time | Driver availability | Dependent on specific backup software
floppy disk | +++ / --- | +++ / + | +++ | 1.44 MB | --- | +++ | + | +++ | NO
ZIP drive | + / -- | + / + | + | 120 MB | + | +++ | + | + | NO
JAZ drive | - / - | - / - | + | 1-2 GB | + | +++ | ++ | + | NO
QIC tape | + / -- | + / - | - | 40 MB - ?? GB | - | + | - | - | YES, most new QIC formats
DAT tape | --- / ++ | ++ / ++ | -- (unless TAR format is used) | 1 - 40 GB | - | + | - | - | YES, unless TAR format is used
other tape | -- / --- | - / --- | --- | ?? | - | + | - | --- | YES
CD-R (write once) | + / +++ | +++ / +++ | +++ | 640 MB | +++ | ++ | ++ | ++ | NO
CD-RW | ++ / ++ | ++ / ++ | +++ (UDF) | 640 MB | ++ | ++ | ++ | ++ | NO
DVD-RAM | + / + | + / +++ | +++ (future) | 5.2 GB | +++ | ++ | ++ | (+) | YES
hard disk (external or network) | +++ / --- | +++ / +++ | +++ | 10 - ?? GB | ++ | +++ | +++ | +++ (not needed) | NO
off-site (Internet) | ??? / probably the cheapest way in the future | + / +++ | +++ | unlimited | +++ | depends on connection type | depends on connection speed | +++ (not needed) | NO
mini disc | ++ / ++ | ++ / ++ | - | ?? | ++ | + | ++ | -- | YES

What files should I backup when?

Conclusion: initial image backup of your system once set up, running & tested; infrequent snapshots of your system whenever changes are made to it, and at least daily backup of your data.

As I have explained in the chapter about partitioning, making backups is much easier and cheaper if you have set up your system properly.

Your system (operating system plus application software) usually does not change very often once set up. Therefore you take a couple of backup "images" after testing, and new images whenever you make a change such as installing new software or device drivers. To take an image of your operating system you have to shut it down, boot from another system (e.g. DOS from floppy), and run imaging software (like Drive Image or Ghost). Just backing up with normal backup software or copying the files of your system partition will NOT enable you to restore a running system in the case of all Microsoft operating systems!!! You must use disk imaging software!!!

Your data (such as data stored by your scripting software, pathology results, referral letters) changes all day long. You can opt for continuous backup (either in hardware, using a RAID controller and extra hard disk(s), or in software, which slows down your system), or for backing up at least twice a day, e.g. at lunch time and after hours. Some software packages do not allow backups of their files while in use - avoid those packages, or increase pressure on the software companies to address this issue.

What backup software should I use?

Conclusion: You need two different types of backup software: one for your system, one for your data.

The software that comes with your operating system ought to be a good choice. If you have used MS-DOS, you know that this is not always the case - the backup program included in MS-DOS was a complete disaster. The backup software included in Windows is actually not "made" by Microsoft but is a stripped-down version of commercial software; it lacks some vital functions (such as system recovery) and has very poor driver support.

System backup:

As mentioned before, you need an imaging program to make an image of your operating system (in the case of Windows, including the application files, due to the DLL and registry nonsense). Good commercial choices are Drive Image from Powerquest ( http://www.powerquest.com ), SnapBack, ImageCast, and Symantec's GHOST, now included in "Norton's System Works Professional Edition". Most of these programs can be downloaded as 30-day evaluation copies from the Internet - just ask your favourite search engine for the location. ImageCast is available at all TUCOWS mirrors. If you use an Adaptec SCSI card and Adaptec's EZ-SCSI software, chances are you already have their imaging software (which is not as flexible as the other two). I have not tried freeware or shareware utilities yet; if you try one, make sure you test the results repeatedly, because your whole system recovery depends completely on this single piece of software. In my experience, no other imaging utility offers as much flexibility and independence from partition types and operating systems as Powerquest's Drive Image, therefore I would definitely use it and no other program. To my knowledge, it is the only imaging program that allows you to restore particular files from the image instead of the whole image, if needed.

Data backup:

Data backup is simpler to achieve because all the backup software needs to do is copy files from one location to another. UDF driver software such as Adaptec's DirectCD (a paid download from http://www.adaptec.com) for CD-R or CD-RW drives, or Seagate's "direct tape" for tape drives, lets you access your CD recorders or tape drives in the same way you would use a hard disk. You can then use any backup program (such as the one included on your Windows installation CD) to back up onto these media.

As the task is simple, any backup software will do, provided it can copy a pre-defined selection of folders and files onto the backup media of your choice.

It has to be able to

run unsupervised, automatically at scheduled times,
verify the backup afterwards (essential for after-hours backups),
retry several times when files are open (or force a backup of open files),
store the files in a standardized file format (like TAR for tapes or UDF for CDs) that will allow other backup software to access the data.
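
The requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a replacement for tested backup software: the function names (backup_to_tar, collect_files, open_with_retries) are made up for this example, scheduling is left to the operating system's task scheduler, and a file that cannot be opened is simply skipped after a few retries rather than force-copied.

```python
import hashlib
import tarfile
import time
from pathlib import Path


def sha256_of(stream):
    """Return the SHA-256 hex digest of an open binary stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(1 << 20), b""):
        digest.update(chunk)
    return digest.hexdigest()


def collect_files(sources):
    """Return (path, archive_name) pairs for every file in the selection."""
    pairs = []
    for source in map(Path, sources):
        if source.is_dir():
            for path in sorted(source.rglob("*")):
                if path.is_file():
                    name = path.relative_to(source.parent).as_posix()
                    pairs.append((path, name))
        else:
            pairs.append((source, source.name))
    return pairs


def open_with_retries(path, retries, wait_seconds):
    """Try to open a file for reading; return None if every attempt fails."""
    for attempt in range(retries):
        try:
            return open(path, "rb")
        except OSError:  # e.g. locked by another program, or missing
            if attempt < retries - 1:
                time.sleep(wait_seconds)
    return None


def backup_to_tar(sources, archive_path, retries=3, wait_seconds=5.0):
    """Archive the selection as TAR, then verify it against the sources.

    Returns (skipped, mismatched): archive names that could not be read,
    and archive names whose stored bytes differ from the source file.
    """
    skipped, stored = [], []
    with tarfile.open(archive_path, "w") as archive:
        for path, name in collect_files(sources):
            handle = open_with_retries(path, retries, wait_seconds)
            if handle is None:
                skipped.append(name)
                continue
            with handle:
                info = archive.gettarinfo(arcname=name, fileobj=handle)
                archive.addfile(info, handle)
            stored.append((path, name))

    # Verification pass: re-read the archive and compare checksums.
    mismatched = []
    with tarfile.open(archive_path, "r") as archive:
        for path, name in stored:
            with archive.extractfile(name) as archived, open(path, "rb") as original:
                if sha256_of(archived) != sha256_of(original):
                    mismatched.append(name)
    return skipped, mismatched
```

A scheduled job would call backup_to_tar with the folders to protect and raise an alarm whenever either returned list is non-empty. Because the result is an ordinary TAR file, any other TAR-aware program can read it.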

If your backup software fails any one of the points mentioned above, look for a better one. If you haven't bought one yet, take this list and make sure the software complies.

How can I be sure my backups really work?

Conclusion: You can't be sure before you have tried to restore your data from your backups.

You can't be sure before you have tried to restore your data from your backups. That is the sad truth. Therefore, once your system is installed, you have to restore it from the image - onto a different partition - and see how it works. Create worst-case scenarios, then try to restore your system and see how it works.
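
For data backups, a trial restore can be automated. The sketch below assumes the backup is a TAR archive whose member names are paths relative to some root folder; the names restore_tar and trial_restore are invented for this example. It restores into a scratch folder, never over the live data, and reports every file that does not match the original byte for byte. The comparison is only meaningful if the originals have not changed since the backup was taken.

```python
import filecmp
import shutil
import tarfile
import tempfile
from pathlib import Path


def restore_tar(archive_path, destination):
    """Restore every regular file of the archive below destination."""
    destination = Path(destination).resolve()
    restored = []
    with tarfile.open(archive_path, "r") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            target = (destination / member.name).resolve()
            # Refuse member names that would escape the destination folder.
            if destination not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
            target.parent.mkdir(parents=True, exist_ok=True)
            with archive.extractfile(member) as source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            restored.append(target)
    return restored


def trial_restore(archive_path, original_root):
    """Restore into a scratch folder; list files that differ from the originals."""
    original_root = Path(original_root)
    problems = []
    with tempfile.TemporaryDirectory() as scratch:
        scratch_path = Path(scratch).resolve()
        for restored in restore_tar(archive_path, scratch_path):
            relative = restored.relative_to(scratch_path)
            original = original_root / relative
            same = original.is_file() and filecmp.cmp(restored, original, shallow=False)
            if not same:
                problems.append(relative.as_posix())
    return problems
```

An empty result only proves that the files can be read back intact. It does not prove that their contents are usable by the application, so opening the restored database with the real software remains part of the test.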

With your data backups it is even more difficult. Let us assume something bad happened to your patient database - the file with all antenatal checkups is damaged, all data lost. You restore the files - technically the backup works - only to discover that yesterday's backup already contained the same error! How far back can you go with your backup system? This scenario is quite common. It tells you that you have to keep several generations of your data backups, well labeled. One of the reasons why CD-R is such an ideal medium is that you are not tempted to overwrite a previous backup with a new one. At a price of $2 per 640 MB of uncompressed data, and with a CD taking up very little shelf space, there is no excuse for not storing backups going back years.
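
The reasoning can be made concrete with a small sketch. Both functions are illustrative only: newest_good_backup assumes you have some way to check whether a restored generation is intact (here a caller-supplied function), and media_cost assumes each backup is written to fresh discs at the $2 per 640 MB quoted above.

```python
import math


def newest_good_backup(backups, is_intact):
    """Return the label of the newest generation that passes the check.

    backups is a list of (label, contents) pairs ordered oldest first.
    Returns None if every stored generation already contains the damage.
    """
    for label, contents in reversed(backups):
        if is_intact(contents):
            return label
    return None


def media_cost(data_mb, number_of_backups, disc_mb=640, disc_price=2.0):
    """Return (discs, cost) for storing every backup on fresh discs."""
    discs_per_backup = math.ceil(data_mb / disc_mb)
    discs = discs_per_backup * number_of_backups
    return discs, discs * disc_price


# Example: the damage crept in on Wednesday, so Tuesday is the newest
# usable generation. With only one overwritten backup it would be lost.
week = [("Mon", "ok"), ("Tue", "ok"), ("Wed", "damaged"), ("Thu", "damaged")]
last_good = newest_good_backup(week, lambda contents: contents == "ok")

# Example: 500 MB of data backed up daily for a year.
discs, cost = media_cost(500, 365)
```

In this example last_good is "Tue", and a full year of daily backups takes 365 discs at a cost of $730.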

Implementation of the "perfect backup system"

Most IT textbooks recommend a "weekly backup of the whole system and a daily backup of your data"; they don't mention imaging at all. Follow their recommendations and you will end up in trouble sooner or later. This antiquated advice comes from times when people were using MS-DOS or similar systems, which were extremely easy to back up and restore. Times have changed. If you want a more reliable system, here are some ideas on how to set it up:

The server will have one CD-RW drive and a spare hard drive for backup purposes. It may have a RAID controller and a RAID disk array, but only a negligibly small amount of data is lost due to hard disk crashes - a RAID won't help you if your files have been accidentally deleted or modified, shredded by a virus, overwritten by a defective program etc. - and these are the common, frequent causes of data loss. It may have a large DAT tape drive attached to it if the amount of data makes that necessary. On the server, an application will run in the background, backing up onto the spare hard disk at defined time intervals, for example hourly. The server operating system will allow open files to be backed up and will let this application run at low priority so as not to slow down the network. Daily, after hours, the data files will be backed up onto the CD-Rs / DVDs / tapes you might use. These backups are stored at two or more different locations (not in the same building) and kept as long as possible, at least one year - ideally they are never disposed of.
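
A minimal sketch of the hourly background copy onto the spare hard disk might look as follows. The function names and the folder layout (one time-stamped folder per snapshot) are assumptions made for this example; running at low priority and copying files that are held open are left to the operating system and are not handled here.

```python
import shutil
import time
from datetime import datetime
from pathlib import Path


def snapshot(data_dir, spare_disk, now=None):
    """Copy data_dir into a new time-stamped folder on the spare disk."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H%M%S")
    target = Path(spare_disk) / stamp
    shutil.copytree(data_dir, target)  # fails if the target already exists
    return target


def run_periodically(job, interval_seconds=3600, rounds=None):
    """Call job() every interval_seconds; rounds=None means run forever."""
    done = 0
    while rounds is None or done < rounds:
        job()
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_seconds)
```

On the server this could be started as run_periodically(lambda: snapshot(data_folder, spare_folder)), where both folder names are placeholders for your own paths. Because every snapshot gets its own folder, an accidental deletion at 11:30 can still be undone from the 11:00 copy.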

The administrator is the one and only person making changes to the server and he/she will - without exception - always make a system image after installing new software, updates or drivers. Again, images are stored at two different locations.

All workstations are connected to the server via a hub in a star topology. All workstations have a "backup partition" on their hard disk. They have 3 batch files or macros:

one to make backups from the server data onto the local backup partition at defined time intervals
one to set up their application(s) to accept the path to the backup partition instead of to the server
one to reverse this back to the server's partition
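
The three batch files could be sketched as three small Python functions. Everything specific here is hypothetical: the function names, and above all the idea that the application reads its data location from a JSON settings file with a "data_path" entry. Real application software will store this setting in its own way.

```python
import json
import shutil
from pathlib import Path


def mirror_server_data(server_data, backup_partition):
    """Batch file 1: copy the server data onto the local backup partition.

    Existing files are overwritten; files deleted on the server are NOT
    removed from the local copy by this simple version.
    """
    shutil.copytree(server_data, backup_partition, dirs_exist_ok=True)


def set_data_path(config_file, data_path):
    """Write the data location into the (hypothetical) settings file."""
    config_file = Path(config_file)
    settings = json.loads(config_file.read_text()) if config_file.exists() else {}
    settings["data_path"] = str(data_path)
    config_file.write_text(json.dumps(settings, indent=2))


def switch_to_backup(config_file, backup_partition):
    """Batch file 2: point the application at the local backup partition."""
    set_data_path(config_file, backup_partition)


def switch_to_server(config_file, server_data):
    """Batch file 3: point the application back at the server."""
    set_data_path(config_file, server_data)
```

The first function would run at defined time intervals from the workstation's task scheduler; the other two are run by hand when the server goes down and comes back.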

When the server is down, they still may continue working with the data on their backup partition. Good application software allows you to merge the changed data back onto the server data file later when the server is up & running again. You may even have a fourth batch file / macro to do so.
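
Where the application itself cannot merge, a crude file-level version of that fourth batch file can be sketched as below. This is an assumption-laden simplification, not what good application software does: it treats whole files as the unit of change, copies a local file back only if it is newer than the server copy or missing there, and does not detect the case where both copies were changed. The name merge_back is invented for this example.

```python
import shutil
from pathlib import Path


def merge_back(backup_partition, server_data):
    """Copy files that are newer on the backup partition back to the server.

    Returns the relative names of the files that were copied.
    """
    backup_partition = Path(backup_partition)
    server_data = Path(server_data)
    copied = []
    for local in sorted(backup_partition.rglob("*")):
        if not local.is_file():
            continue
        relative = local.relative_to(backup_partition)
        remote = server_data / relative
        if not remote.exists() or local.stat().st_mtime > remote.stat().st_mtime:
            remote.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(local, remote)  # copy2 keeps the modification time
            copied.append(relative.as_posix())
    return copied
```

For a single shared database file this approach is too coarse, because changes made on different workstations would overwrite each other; it is only suitable for data kept as separate files, such as individual letters.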

Whenever software / driver changes are made to a workstation, the administrator will be notified and he will create a new image of the boot partition.

At chosen intervals, but at least weekly, one of the 2 newest backup sets is taken to a different location and stored there (definitely not in the same building). As all our confidential data is encrypted to the highest standards, there is no confidentiality issue.