Hard Lessons with Hard Drives


Good Gear




The Photography Blog



June 2012



This is the 2020 update to my popular article on why and how to make sure your backups are sorted. Cloud storage is not always the answer when big RAW files and big photo collections are at stake. I spend more on hard drives than cameras, so read on to find out why.







The importance of backing up your photos never really hits home until you’ve lost a bunch of photos. It could be an SD card failure, it could be a mix-up during a confusing download sequence, or maybe a portable disk that doesn’t make the flight home in working order. They use the term “Data Loss” but that sounds so vague and pallid, it fails to convey the emotional state of sitting on a hotel bed trying to work out where 4 days of a client’s photos have disappeared to. Or the anxiety of having to potentially explain it and maybe redo the entire shoot.

If you love your images, keep them safe. Keep them safe from failures of technology. Keep them safe from acts of human stupidity. Keep them safe from fire and theft.

That last one is a pickle without serious resources or at the very least a technologically aware family member. But it is possible.


What Can Go Wrong

Several years ago I had a spate of disks go bad on me. A laptop drive failed, then a disk in my server, then a portable drive I use for backing up while travelling. In all I lost 8 drives in a 12-month period. Hard drives were once a premium purchase, an investment in reliability. But the competitive market at the time turned them into a commodity item and that assurance of reliability was no longer in effect. We think of digital media as being a life-long solution, but in reality the life of a drive is measured in months instead of years.

Optical media was once thought of as archival. How often in your life have you pulled out a CD to play and realised it was scratched or just no longer working? Storing data on CD or DVD has the same issue of expiring bits, with the added issue of very small capacity. Our hunger for pixels has grown so much that the storage offered by optical media is just not worth contemplating anymore anyway. Disk storage is king because it’s big and it’s fast and it’s cheap.

And when I say “disk” I no longer mean disk. “Solid State Drives” (SSD) are the new heroes for their speed and reliability. No moving parts to fail, just a bunch of persistent memory cells packed into a product that takes the place of a traditional hard drive. They work extremely well these days and are much more affordable. But take note, SSDs also have a finite life span.

In fact all storage is temporary, and that must be how you think about any solution to your data security. Whatever you have right now will need to be moved to a newer device later. Usually this happens when we buy a new computer, or a bigger drive. We don’t think about the ageing of our hardware because we’re buying something new for other reasons. All storage is temporary and will fail eventually.


What is RAID and NAS?

This will be the most technical part of the article, but for many very useful. Please hang in there with the boring stuff :) Network Attached Storage = NAS. This is a big box with hard drives inside and a server onboard so that the disks can be accessed across your local network. This can be wifi or ethernet. What matters here is that you don’t plug the NAS into your computer, you simply access it over the local network. This is good and bad. If you only use wifi it will be much slower than an internal or USB drive, although gigabit ethernet does a pretty good job on faster NAS devices.

You can put the NAS box anywhere in the house and kind of forget about it. You access the device through a web browser and these days they have lots of connectivity. They can ping your mobile to let you know if a drive is full or unhealthy. You can even set up remote access to the NAS to dig out files while travelling. They are proper servers with the benefits of cloud connection. Most NAS boxes these days are easy to manage and offer their own ecosystem of applications and features. You don’t need to be an IT expert to manage them. I currently use a QNAP 8-bay device with 22TB of storage that can tolerate two failed drives.

RAID is how a NAS box stores data inside and the acronym refers to having an array of disks that achieve redundancy in the event of an individual disk failure. In other words, RAID means that if one disk fails I can still retain my data. For the sake of simplicity I will talk about NAS boxes with either 2-bays, 4-bays or 8-bays.

2 bays allow you to implement RAID 1, which is when you simply mirror the data on the 2 drives. If your 2-bay NAS has a pair of 4TB drives you DO NOT get 8TB of storage, you get just the 4TB but if either of the drives dies you can still keep working.

4 bays allow you to implement RAID 5, which uses a more complex algorithm to achieve redundancy while maximising storage capacity. If your 4-bay NAS is loaded with 4TB drives you end up with 12TB of storage that will tolerate any single disk failing.

8 bays allow you to implement RAID 6, which is similar to RAID 5 but adds another degree of security at the expense of capacity and tolerates any two drives failing without losing your data. An 8-bay NAS fully loaded with 4TB drives will yield 24TB on paper, or roughly 22TB once the NAS reports it in binary units.
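
If you want to check the sums for a different combination of bays and drives, the arithmetic is simple enough to sketch in a few lines of Python. Treat this as a rough illustration only: the function names are made up for this example, it assumes every drive in the box is the same size, and it ignores the small amount of space the NAS keeps for itself.

```python
def usable_tb(drive_count: int, drive_tb: float, raid_level: int) -> float:
    """Nominal usable capacity in TB for a box of equal-sized drives."""
    if raid_level == 1:
        if drive_count < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        # Every drive holds the same copy, so you only get one drive's worth.
        return drive_tb

    # Number of drives' worth of space given up to redundancy.
    redundancy = {5: 1, 6: 2}
    if raid_level not in redundancy:
        raise ValueError("this sketch only covers RAID 1, 5 and 6")

    given_up = redundancy[raid_level]
    minimum = given_up + 2  # RAID 5 needs 3 drives, RAID 6 needs 4
    if drive_count < minimum:
        raise ValueError(f"RAID {raid_level} needs at least {minimum} drives")
    return (drive_count - given_up) * drive_tb


def tb_to_binary_units(tb: float) -> float:
    """Convert decimal TB (as printed on the drive label) to binary TiB."""
    return tb * 1e12 / 2**40


print(usable_tb(2, 4, 1))  # 2-bay mirror
print(usable_tb(4, 4, 5))  # 4-bay RAID 5
print(usable_tb(8, 4, 6))  # 8-bay RAID 6
print(round(tb_to_binary_units(usable_tb(8, 4, 6)), 1))
```

The last line hints at why an 8-bay RAID 6 box full of 4TB drives may be reported closer to 22TB than 24TB: 24TB on the drive labels is roughly 21.8 when counted in binary units.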


Cloud Based Solutions

Bollocks. Most of us will never have sufficient internet speed to seriously make use of cloud storage beyond gigabytes, or sufficient budget to make permanent cloud storage of our life’s work a reality. It’s a problem of scale. Cloud solutions are great for documents and the limited scale of essential digital assets. You can have 200GB of data on your laptop synced with a cloud storage solution (Dropbox, Google Drive, etc) and enjoy the confidence that if your device dies or is lost, you still have all your work accessible. Plus you can sync to the desktop or access on your smartphone.

200GB is not enough for a photographer though. I can shoot that in 10 minutes, and in reality I need Terabytes. My data consumption gets worse the busier my schedule is, as I start to fall behind in processing RAW files and haven’t got the time to cull the duds. I should keep roughly 5-10% of my images, but I still have photos taken in China a decade ago that need a proper run through for processing and curation.

Most of us will easily fill a 4TB disk, and many will fill 10TB and the more dedicated will go waaaay beyond 10TB. The absolute cheapest option to store 16TB in the cloud is AU$50 per month. The cloud is good for many many things, but it is not a great option for storing a decade’s worth of 50MP RAW files. I will come back to cloud storage later in the article though and how it might become useful for bigger roles.


Master Copy

The other issue with cloud storage is that you need to be ready for that day when your cloud provider closes down the service. In short: always have a local copy of everything. One reason we love Dropbox and the other file sync alternatives is that all your files are still on your computer. You can see them and manage them just as normal, and if Dropbox suddenly stops working one day you still have your copy where you need it.

Safely storing your photos should be similar, just on a bigger scale.

The worst thing you can do with your RAW files is to scatter them across many hard drives. Many people do this to reduce the risk of losing everything, when in fact they are increasing the risk of losing something. Each drive is a potential point of failure so if you have images spread across ten drives you have increased the likelihood of losing a bunch of photos by that factor of ten. The answer is to put everything on a single drive and then back it up.
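
To put a rough number on that, here is a small Python sketch. The 5% yearly failure rate is picked purely for illustration, and the sum assumes drives fail independently of each other and that none of them is backed up. The function name is made up for this example.

```python
def chance_of_losing_something(drive_count: int, annual_failure_rate: float) -> float:
    """Probability that at least one of the drives fails within a year.

    Assumes each drive fails independently with the same yearly rate.
    """
    chance_all_survive = (1 - annual_failure_rate) ** drive_count
    return 1 - chance_all_survive


# Hypothetical failure rate, for illustration only.
rate = 0.05

for drives in (1, 2, 5, 10):
    risk = chance_of_losing_something(drives, rate)
    print(f"{drives:>2} drives: {risk:.0%} chance of at least one failure in a year")
```

Under those assumptions the risk with ten drives is not exactly ten times the risk with one, but it is several times higher, and for small failure rates it gets close to that factor of ten.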


Data Flows

In reality I run two data flows, one for getting my work done and another for keeping my work safe.

The first is working storage; the fast stuff connected to my computer. This may be internal SSD or external SSD, but it will be SSD. The newest Macs offer disk speeds on the internal SSD that are an order of magnitude faster than the fastest external SSD - and they were already very very fast. Disk speed is critical when digging out the winning RAW files from a big photo shoot or an editorial set.

When the jobs are done I send them off to our big disk. We use a NAS box to store our valued work and keep it safe. The NAS is big and slow but it’s also very clever and can tolerate a lot of disk failure. NAS stands for Network Attached Storage and the box does exactly what the label says. We access it over the wifi network and can quickly locate old content from past travels and clients if need be. Finding stuff is quick, but if you have 200GB of RAW files from a particular event you want to reload into Capture One, you’re best to move it back to an SSD and work locally.

NAS devices can also be accessed over an ethernet cable with gigabit speeds, or you can just plug in an external SSD to the NAS box directly and move files even faster. The essential thing to remember about these devices is that a) they are easy to access over the network and b) they offer RAID technology inside the box to protect your data.


Keeping Data Safe

Safe from what exactly? Backups are not merely insurance for failed hard drives, although for many of us that is the number one issue, but can be useful insurance for our own stupidity as well. RAW files are not dynamic in the way your business receipts are, and yet the treatments and processing you apply to them may well have a value and currency that extends beyond the original RAW capture. We want to preserve both the processing of the file and the RAW original itself, so accidentally deleting an image library or folder with all the Capture One edits can be very painful.

If you have a backup copy of the drive you can access that version and roll back the files you deleted by mistake.

And this is what you need to recognise about RAID solutions as distinct from a backup disk: RAID keeps your data synchronised across many physical disks, which means if you delete or overwrite RAW files or the edits, that data loss is synchronised as well. RAID is a solution to manufacturing mistakes, not a solution to your mistakes. The same issue is relevant for ransomware. If your files are impacted by a virus that locks down all the data, then your NAS box will faithfully synchronise that locked-down version as well. I should note also that most ransomware attacks target organisations rather than people, and they tend to pick out municipal level infrastructure where decades of budget savings have left security holes in close proximity to large budgets.

If you do need a physical roll-back option, you need a separate backup device. Even then, you are limited to the backup window to achieve a roll-back. Imagine you have a 10TB disk for storage and a separate 10TB backup disk, and once a week at midnight on a Sunday the drives are set to synchronise the backup. If you delete a really important folder by mistake on a Tuesday, you have a backup on hand to save the day. But if you don’t realise the mistake for a week or two, the backup window has already gone and your deleted data with it.
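
The weekly example above can be sketched in Python to show how narrow that window is. This is a simplified model, not how any particular backup tool works: it assumes the sync runs at 00:00 every Sunday, that it mirrors the storage disk exactly (deletions included), and the function names are invented for this illustration.

```python
from datetime import datetime, timedelta

SUNDAY = 6  # Python counts Monday as 0


def next_sync(after: datetime, weekday: int = SUNDAY) -> datetime:
    """First weekly sync (00:00 on the given weekday) strictly after `after`."""
    midnight = after.replace(hour=0, minute=0, second=0, microsecond=0)
    days_ahead = (weekday - midnight.weekday()) % 7
    candidate = midnight + timedelta(days=days_ahead)
    if candidate <= after:
        # That sync has already run, so the next one is a week later.
        candidate += timedelta(days=7)
    return candidate


def can_roll_back(deleted_at: datetime, noticed_at: datetime) -> bool:
    """True if the mistake is noticed before the next sync copies it across."""
    return noticed_at < next_sync(deleted_at)


deleted = datetime(2020, 5, 12, 10, 0)   # a Tuesday morning
print(next_sync(deleted))                # when the window closes
print(can_roll_back(deleted, datetime(2020, 5, 14, 9, 0)))   # noticed on Thursday
print(can_roll_back(deleted, datetime(2020, 5, 26, 9, 0)))   # noticed two weeks later
```

With a single mirrored backup the window is never longer than the gap between syncs, which is why extra backup sets or versioned backups are worth the trouble.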

This is why Time Machine on macOS is such a powerful tool, as it records the incremental changes to a computer over time and allows you to step back through various backup points. This works well on a single computer with lots of small files that change. To implement the same approach to a 10TB disk of RAW files would need maybe as much as 20TB of backup space. It’s doable but expensive.


Offsite

If you do have a NAS box or a backup drive the next problem is the risk of fire and theft. That big NAS box with all the redundant hard disks is keeping your data safe from disk failure, but if someone breaks into the home and steals the NAS box then you are left with nothing. Same for a fire. Most people who use a backup drive instead of a NAS box keep it in the same house as well, so the risks here are very similar.

Time Machine backups are a potential winner here. You can run multiple sets of backups instead of just one, and rotate the backups with a friend or family member. It means having two or more spare 10TB disks instead of just one, and it means implementing a rotation schedule. I used to do this a decade ago before I moved to a NAS solution. I had a backup drive permanently connected to my file server and would swap it out every month or so. This gave me a wider backup window for roll-backs, and gave me a third copy of my data in case my first backup failed. This did happen to me once, and I recall driving across Melbourne with my final working copy of a hard drive that contained my entire life's work.

Rotating backups is not a great solution for most of us because the reality is you are unlikely to commit to switching out disks very often, if at all. Humans are not good at repetitive tasks. Humans are not good at repetitive tasks.


Cloudy Offsite

In a perfect world cloud based backups would be practical and affordable. But as I’ve discussed, our world is far from perfect. My ideal situation would be a single NAS box with everything of value archived away with RAID protection from drive failure, plus a cloud storage service that lets me store the contents of that NAS box offsite with regular synchronisation. There are options here that are worth looking at, depending on your volume. Getting 10TB on the cloud will cost anywhere between AU$50 and AU$500 every month. Many of these solutions offer version control, which is a granular way of achieving roll-back and very valuable in the event of a ransomware attack.

Price-wise these solutions are worth looking at. If you have to buy $1,000 worth of drives each year to feed your expanding NAS box then maybe some of those dollars can be better spent. Have a look at Mega, Wasabi, ElephantDrive or Backblaze B2 for a start on NAS to the cloud backups.

Assuming the price is in your ballpark, the next problem is the speed of getting your data onto the cloud service, and more importantly, the speed of getting it back again in the event of a data emergency. How long does it take to get up and running again if your local storage fails? Some cloud services recognise this issue and offer the option of a “seed disk”. They send you a big drive and you fill it with the initial copy of your data which will seed the cloud storage service. From then on you synchronise online, with the option to get back a big disk in the event of disaster. Backblaze call their product “Fireball” and you essentially rent their hardware for as long as you need to seed with your data or recover with your data. US$550 gets you 30 days of rental.
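
To get a feel for why a seed disk can make sense, here is a back-of-envelope sum in Python for how long a big transfer takes over a home connection. The link speeds are hypothetical examples, not measurements of any provider, and the sketch assumes the connection runs flat out the whole time, which real connections rarely do.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move `data_tb` terabytes over a `link_mbps` megabit/s link.

    `efficiency` scales the link down for real-world overhead (1.0 = ideal).
    """
    bits_to_move = data_tb * 1e12 * 8          # decimal terabytes to bits
    bits_per_second = link_mbps * 1e6 * efficiency
    return bits_to_move / bits_per_second / SECONDS_PER_DAY


# Hypothetical connection speeds in megabits per second.
for mbps in (20, 100):
    days = transfer_days(10, mbps)
    print(f"10TB at {mbps} Mbps: about {days:.1f} days of non-stop transfer")
```

Run the same sum with your own upload and download speeds before committing to a service; the download figure matters most because that is what you face on the day you need everything back.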

The idea of paying a fee for recovery of your data is pretty good value I think, so long as the storage itself is priced at a minimum. For Australians however the cloud storage landscape is a problem as the affordable options are based overseas and this puts a crimp in the practicality of seed disks. Backblaze sounds like a great option if you need a lot of storage, but the cost of seeding the service and taking on the risk for a 20kg seed drive (yes it’s actually a NAS box of course) is prohibitive.

Another way to get an offsite copy while keeping local RAID storage is to duplicate your NAS box, and have the second box stored at another address. That’s now double the price plus you need the right kind of family or friend to make it happen. Maybe you have another friend who needs an offsite backup too, and you can back up each other? If you have two of the same NAS systems in place they will have features that allow them to replicate and synchronise between each other. You can do the initial sync on a local network, and then relocate the remote NAS to utilise broadband connections for ongoing syncs. In the event of data failure you retrieve the remote NAS and get running again quickly.

It’s very practical to do this, but I have never met a single person myself who has chosen to implement it.


The Short Version

Backups are essential. One backup may not be enough. RAID inside a NAS box is simple and effective. Offsite is hideously expensive to implement, especially in Australia.

This is a lot to digest and even more to execute. There’s an old saying amongst motorcyclists about how much to spend on a helmet: “It depends how much you think your head is worth.” You might need to shed AU$1600 to achieve a redundant copy of your 12TB collection. And then you’ll go and shoot another 4TB next year when you start shooting everything on 50MP RAW files and 4K video. Note that for an extra AU$1200 a year you can have that stored safely offsite with the Backblaze B2 service.

The most practical solution is likely a combination of technologies, which relies on you being organised enough to decide what is essential for offsite backups and what is not. If you offsite the entire NAS box, it's slow and expensive. If you can limit the offsite to a few TB then you can shop around for an option that fits your NAS and your budget. I probably should have put ORGANISED in all caps.

My suggestion is start looking at your options, today. If you haven’t got a backup please get one. If you have a pile of drives sitting around your desk, the best step is to plan to get rid of them all. Work out if you need 4TB or 10TB or more. Look at a NAS box with lots of bays and the ability to implement RAID level 5 or 6. And don’t just think about what you need now, think about how quickly you will fill that NAS box in the coming year or two.

Spend the time to plan a solution. Then spend the money. Then breathe easy. Having your data secure is not wasted money, it’s peace of mind and financial security for your business.






JUST THE FACTS



QNAP 4-Bay

qnap.com

Here’s one suggested setup for you that will give you 12TB of storage to survive a single disk failure and keep rolling. You can buy this NAS box for under $800, and a set of 4x 4TB disks for another $800. I suggest the Western Digital RED drives for NAS because they are designed to cope with the stresses of being loaded into a server. I avoid Seagate because they cut corners on quality.

The marketing team have added tonnes of features to these NAS devices, like being able to stream photos and video to your Smart TV or using an App on your phone to browse the contents. Some are powerful features, others just clog up the interface. But the QNAP family is very effective, easy to set up and the disks are hot swappable. I have survived hot swapping a sick drive on the QNAP with zero downtime and no stress. I have also migrated from a 4-bay to an 8-bay with QNAP and it recognised the disks for me and made it lovely and simple to expand.
This feature was last updated on Tuesday 12th May 2020

Copyright: All images and words on this web site are copyrighted and may not be used without permission.
Feature written by Ewen
