Glenn threw a real spanner in the works just before I went off to buy this damned USB Hard Drive as a backup device!
We both bought external hard disks around the same time, and would probably both do it differently if we were to do it again!
Original photo by Topato
So where do we start?
Glenn’s advice was to get a cheap desktop (circa £200), stick 4 HDDs in it and RAID-5 them. This got me thinking – I have my old desktop, and I’ve promised Emma I’ll buy a new desktop (one that doesn’t have wires hanging out and isn’t in a terrible half-working condition) when we move.
Here’s the problem – I’ve not built a PC in years, and I know NOTHING about RAID.
Glenn sent me this article, which discusses RAID in some detail and how to achieve it under Windows. To be fair, I really didn’t want to be looking at yet another Windows license.
Expensive “Out of the Box”
So I started looking around at the various “out of the box” solutions that would just require me to populate them with hard drives. Ultimate Storage seem to be the people for this, but the prices are high when you consider I’m likely to be buying four hard disks to put in them! It would be very easy to price yourself out of the market and into a high end solution with this project.
Another Requirement – Futureproofing
That’s another point, looking at hard disk prices, 1TB disk drives are supremely expensive, but that won’t always be the case – I’d be better off with 4 x 500 GB and then upgrading later, so whatever solution I go for should support that potential need for more space, and upgrading the existing space I have.
A Little Project is Born!
That’s when I stumbled across Tom’s Hardware and his Cheap, Fast, DIY RAID 5 NAS box…. This looks like a cunning plan, and we’ve found a card not dissimilar for just over £100 (the 8 port version is probably more aimed at the corporate/server world, hence its price hike!). 4 x 500GB hard disks come in at around £70 each including VAT. So providing my old desktop and case can take the pain – I might be able to do this for circa £400.
Just Backup Space?
I know the question most people are asking: if you RAID 5 these four 500GB disks and you’ve got 1.5TB of data, are you really going to use it as backup space?
The answer is no, it’s likely to become a file server, a huge one at that! But with the four disks in, if one of them goes wrong I can quite easily replace the drive (using Tom’s “sane” method) and be up and running again with no data loss. Why would I then need a backup device?
Offsite backups have to be the next question I guess (but at the bare minimum I would expect to leave a DVD drive in the file server so I can back up data and take the discs elsewhere?). I’ll discuss that in another post.
Interesting – I would go for the server route myself too (I am thinking of the SBS2003 route, as I want to run Exchange), on a mirrored 500GB set-up with an external 500GB USB drive as a nightly backup via Veritas (now Symantec) Backup Exec.
Personally I would go for hardware mirroring (via the RAID card) over the Software / Windows RAID option (as per Windows Software RAID Guide link I just read).
Hardware fault tolerance will give you the ability to RAID 1 or 5 (or 6 or 10!) the O/S drive (yes, even Windows) as well as the data drive. This is easy to set up on HP servers, which I build with RAID 1 for the O/S and RAID 5 for data, quite often with a hot-swap spare, but I have also done XP desktops for gamers via built-in on-board RAID and RAID cards.
Also, your question (and answer) about 1.5TB of data – you’re right, you always need more space than your storage needs, and as a general rule of thumb, you’ll want to keep disks running at no more than 50% capacity, so that’s 750GB of what I would deem “really useful space”, with the next half as “space I can use that now affects the performance of my system” – but to be fair, you don’t start noticing much until you get to about 70-80% disk capacity. Certainly don’t go past this!
Certainly backup & data recovery (disaster recovery / business continuity) is a very hot topic, and up until 9/11 it wasn’t invested in enough as a share of total IT spend; these days fault tolerance, DR & BC are more heavily invested in.
An interesting point to note: you may have what you deem ‘your’ best backup solution for your home / business (to suit budget, speed of restore, capacity, etc) – have you ever tested it? Backup is only 50% of the total solution; the RESTORE is the other half, and often 80% of the total amount of work! Once you build your RAID and you aren’t running “live”, pull the power from one of the disks (turn the PC off first!) and hopefully you will see your redundant disk kick in (via the RAID manager) – satisfy yourself that you can get data on and off those disks, because at 2am when something dies and you really want that file, you know that you have a higher chance of success!
I’m sure you can see that these ‘hot-topics’ can be talked about all day – so it will be interesting to hear other people’s views…and also your thoughts and ideas for your on-site / off-site backup system which will follow…
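As a rough illustration of that rule of thumb, here is a minimal Python sketch that reports how full a volume is and classifies it. The 50% and 80% thresholds are taken from the comment above; the function names and the status labels are made up for this example.

```python
import shutil


def usage_percent(path):
    """Return how full the volume holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def capacity_status(percent_used, comfortable=50.0, limit=80.0):
    """Classify a usage figure against the rule-of-thumb thresholds.

    `comfortable` and `limit` are assumptions for illustration, not
    values any RAID card or operating system enforces.
    """
    if percent_used < comfortable:
        return "comfortable"
    if percent_used < limit:
        return "usable, watch performance"
    return "over the limit"


if __name__ == "__main__":
    percent = usage_percent(".")
    print(f"{percent:.1f}% used: {capacity_status(percent)}")
```

On a 1.5TB array, the "comfortable" band ends at 750GB, which matches the figure quoted above.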
Darren
Hi Darren,
I’m thinking the RAID 5 rather than the mirrored route will give me the chance to not need another USB hard drive to backup all that data.
Not sure I need the Hot-swap – thoughts?
The problem is testability, as you say we should test the restore – and that is something I intend to do. Although without the hot swap (can I do that with just four disks?) I’m not sure how well it will work – all I know is it’s got to be a hell of a lot better than my existing solution!
Hi Keiron,
Forgive me, you won’t need “hot-swap” – this is where the drive-bay is openly accessible (i.e. live server environment) and you can pull a ‘dead’ disk out and replace it with a new one while the system is running live – needed for enterprise class solutions.
For home, 4 x 500GB NHP (not-hot-plug) SATA/SATA2 disks is fine, RAID 5 (hardware RAID 5 through the RAID card), so that’s 3×500 GB for data (1.5TB) and the 4th will be the spare.
To test, copy a load of files / photos (make sure you have backed them all up previously) onto the RAID, make sure you can access them OK, then turn the PC off, remove power from one of the drives (this will hopefully signal a drive failure to the RAID card), then watch your PC struggle for a while as it re-builds the spare disk to complete your RAID again – then once completed (the RAID card’s RAID manager software should tell you when), test you can fully access the data and job done – your first live RAID recovery.
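One way to make the "test you can fully access the data" step less of a spot check is to record a checksum for every file before pulling the drive and compare afterwards. The sketch below is only an illustration: the function names are invented, and the path in the usage note is a placeholder for wherever the RAID volume is mounted.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(before, after):
    """Return (missing, changed) file names between two manifests."""
    missing = sorted(set(before) - set(after))
    changed = sorted(
        name for name in before
        if name in after and before[name] != after[name]
    )
    return missing, changed
```

Usage would be along the lines of `before = build_manifest("/mnt/raid")` ahead of the test and `compare_manifests(before, build_manifest("/mnt/raid"))` once the rebuild has finished; two empty lists mean every file came back intact.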
As it’s not hot-plug (so you can’t see any nice HDD lights to indicate if a failure occurs or not), I’d be interested to see if the RAID card comes with a RAID manager and lets you know that a disk has died and it’s re-building!
Some people run RAID 5 with 3 disks – so that’s no redundancy (risky and pointless), most with 4, so that’s 3 for data and 1 spare, which gives you the ability to withstand 1 disk failure; some people even run 3 live data disks and up to the same again in ‘spares’ to recover from multiple disk failures – once again data centre stuff, so with what you want to run, 4x500GB will be fine!
I have seen companies who employ vast amounts of people and hold a lot of critical data who operate on a lot less redundancy than this…you would be very surprised that often, simply backing up to a USB key is more than some companies do in a lifetime!
When you get the bits – let’s have a play!
Darren
Hi
Not sure if I have mis-read Darren’s post, but he seems to say that 4 x 500GB drives will give you 1.5TB storage plus a spare.
Not sure why a spare would be needed in a RAID 5 array; after all, part of the purpose of 5 is to stripe the data across 3 drives so if any one drive fails, it can be swapped out and the RAID array will automatically rebuild the missing data.
I also reckon that Darren is wrong on his maths: a 3 x 500GB array (plus the spare drive) will actually give 1TB of storage, whilst if the 4 drives are included in the array, that’s when you will get your 1.5TB.
It is worth paying a little attention to MTBF (mean time between failures). Some people get this wrong (in the wrong direction), so if a drive is quoted at 100,000 hours MTBF, this means that it will fail once somewhere in 100,000 hours of use. With 4 drives, the MTBF is 4 in 100,000, or 1 in 25,000 hours.
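For anyone wondering how the array can rebuild the missing data, RAID 5 keeps a parity block for each stripe, computed as the XOR of the data blocks. The toy Python sketch below shows the principle on three tiny in-memory "blocks". It is an illustration only: real controllers work on much larger blocks and rotate the parity across all the drives, and the function name is invented for this example.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    result = bytearray(length)
    for block in blocks:
        for index, byte in enumerate(block):
            result[index] ^= byte
    return bytes(result)


# One stripe: three data blocks plus one parity block (four drives in total).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the drive holding the second block has died: XOR-ing the
# surviving data blocks with the parity block reproduces its contents.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

The same trick works whichever single block is lost, which is why one drive's worth of space goes to parity and why a second failure before the rebuild completes is fatal.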
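The capacity arithmetic is easy to check: a RAID 5 array gives up one drive's worth of space to parity, so usable space is (drives in the array minus one) times the drive size, and a hot spare sitting outside the array adds nothing. A small Python sketch, with a function name made up for this example:

```python
def raid5_usable_gb(drives_in_array, drive_size_gb):
    """Usable capacity of a RAID 5 array built from equal-sized drives.

    Hot spares are not counted: only drives that are members of the
    array should be passed in.
    """
    if drives_in_array < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drives_in_array - 1) * drive_size_gb


# 3 drives in the array plus 1 hot spare: 1000 GB usable.
three_plus_spare = raid5_usable_gb(3, 500)

# All 4 drives in the array, no spare: 1500 GB usable.
four_in_array = raid5_usable_gb(4, 500)
```

So the choice with four 500GB drives is between 1TB with a spare standing by and 1.5TB with no spare.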
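The mean-time-between-failures arithmetic above can be sketched in a few lines of Python. This assumes each drive fails independently at a constant rate (the usual simplification behind a quoted MTBF figure, and not something real drives follow exactly); the function names are invented for this example.

```python
import math


def array_mtbf_hours(drive_mtbf_hours, drive_count):
    """Expected hours until the first failure among `drive_count` drives."""
    return drive_mtbf_hours / drive_count


def failure_probability(drive_mtbf_hours, drive_count, hours):
    """Chance that at least one drive fails within `hours`.

    Assumes independent drives with a constant failure rate.
    """
    rate_per_hour = drive_count / drive_mtbf_hours
    return 1.0 - math.exp(-rate_per_hour * hours)


# Four drives each quoted at 100,000 hours.
first_failure = array_mtbf_hours(100_000, 4)

# Probability of seeing at least one failure in a year of 24/7 running.
one_year = failure_probability(100_000, 4, 24 * 365)
```

Under these assumptions the more drives you add, the sooner you should expect to be replacing one of them, which is the point being made about getting MTBF wrong "in the wrong direction".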
Regards
Andy
Hi Andy,
You are partially correct, I seem to have written a hybrid of two systems here and confused myself!
As we know, RAID 5 is a min of 3 disks, so that would be 3×500 – 500 = 1TB of useable data space – this option I feel is good for servers which are openly accessible with Hot-Swap spare drive-bays, such as HP’s ML350 / DL380 servers – because when a drive fails, the RAID manager will a) tell you, b) the little light on the disk which is viewable will turn orange / red – indicating a fault and your RAID is at risk and requires attention (i.e. a new disk).
As Keiron is going to run effectively a hybrid PC – he’s going to go for internal, Non-Hot-Plug (NHP) drives, so potentially could be blind to any disk failure, so one failure would be fine, two failures equals disaster. This is where the hot-spare comes into play, so 3×500 live in his RAID 5 (1.0TB of useable space, as you rightly pointed out) and the 4th drive will come into play should a disk fail.
– think of the situation, the system’s in, been fine, you’ve been running it for 18 months, you go on holiday for 2 weeks, day one a disk goes…that’s OK…so far, just need to wait for a replacement disk…then, oops, a week later a second disk goes…ouch – rare I admit, but it happens, usually when systems are forgotten about and not administered.
This is personally why I like and I spec systems with Hot-Swap Spares (yep, lots of them if the customer can afford it, on the mirrored O/S and the RAID 5 data), especially if they need HA and are sitting in data centres, or even if it’s an SBS system remotely managed for an office of 5 or 25 people.
Like all disaster recovery – you want to spec the system for how much data you can afford to lose vs cost.
You are very right to point out the correct definition of MTBF. I’ve seen many a drive go faulty – at the end of the day they are a moving part based on old technology, and are susceptible to failures.
Cost of disk is quite low at the moment, so if you can afford it, fill it up with disk!
(and memory!)
Darren
Why does 1TB suddenly sound less appealing than 1.5TB when just a few years ago a 100GB disk was considered big!??!
I believe Bill Gates famously once said:
“640K ought to be enough for anybody”
(err the Vista image fits onto a DVD?)
lol,
So are you guys telling me that I should be going for just 3 drives and ~1.0 TB of storage, and a fourth one in there in case one fails?
Can’t I go for four, and have it notify me when one fails (I assume it does somehow?) so I can just go and buy another to let it rebuild.
We’re not talking about me using this file server day in day out – it will contain some important data, but probably nothing I couldn’t wait the 24 hours to buy another HDD?
If you’re happy with the redundancy level, then buy 4 x 500 GB drives, RAID 5 across the whole lot giving you 1.5TB of data – the RAID manager software which comes with the hardware RAID card “should” tell you when a drive fails – this certainly is the case with HP RAID cards and RAID managers.
I do stress to use a hardware RAID card with RAID manager software – Windows is usually crap at doing a hardware job as you know!
I have been doing a lot of reading, since I was considering adding an NAS solution for my home network. My data consists mainly of videos (TVs and movies) and pictures (many many years worth).
Anyways, out of the box solutions seemed a bit too pricy and the RAID not that spectacular unless you’re willing to spend, so I began looking at building my own fileserver, with a hardware/software RAID solution. That was a bit better bang for the buck, but I still had one nagging concern.
I’ve played around with RAID before, and I realized that mirroring (the only RAID option I was really considering) relied on the RAID controller. I couldn’t just take a hard drive, remove it physically from the array, and have my information accessible when plugging it into another computer. What happens in a few years if your RAID controller dies and you can’t find the exact same one? Your array will always be dependent on that controller and I really don’t like that feeling. I’d rather have the option of taking a drive and plugging it into another computer, rather than needing to move the whole array (RAID, NAS, DIY file server) around. That means quicker access to my information or the ability to take it with me anywhere I go, on a moment’s notice.
The least costly solution I have come up with, for data that doesn’t change all that much, is to have two huge drives (1 TB) on a computer, either one or both connected via eSATA. Just remember to ghost/copy the main drive on a regular basis, and keep the ‘backup’ drive detached (preferably located in a fire-proof safe) the rest of the time.
I don’t know where else I can post my thoughts, and your post on backing up, NAS and RAID seems quite recent, so I thought I’d share them here and get some feedback at the same time.
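A periodic copy like this does not need special software. The Python sketch below is a minimal one-way copy that only transfers files that are new or have changed since the last run. It is an illustration rather than a replacement for a proper imaging tool: the function name is invented, the drive paths in the usage note are placeholders, and the two-second timestamp tolerance is an assumption to cope with filesystems that store coarse modification times.

```python
import shutil
from pathlib import Path


def mirror_copy(source, backup, mtime_tolerance=2.0):
    """Copy new or changed files from `source` to `backup`.

    Returns the relative paths of the files that were copied. Files
    deleted from `source` are left alone on `backup`.
    """
    source, backup = Path(source), Path(backup)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source)
        dest = backup / relative
        if dest.exists():
            src_stat, dest_stat = src.stat(), dest.stat()
            same_size = src_stat.st_size == dest_stat.st_size
            not_older = dest_stat.st_mtime + mtime_tolerance >= src_stat.st_mtime
            if same_size and not_older:
                continue  # backup copy looks current, skip it
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves the timestamps
        copied.append(str(relative))
    return copied
```

Something like `mirror_copy("D:/", "E:/")` would be run with the backup drive plugged in, after which the drive goes back in the safe.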
Cheers,
Tim
Hi Tim,
Thanks for dropping by, and apologies for me taking so long to moderate your comment through – I didn’t want to let it through until I had time to reply properly!
As soon as I started looking at RAID 5 out of the box solutions I was in the enterprise arena, and really didn’t fancy the prices!!
I see your point, on the RAID controller – hadn’t really considered that, and figured I’d probably have to keep it current over time!
That’s an interesting solution (and very simple too); the fact that you disconnect the other drive means that you would never suffer a power surge blowing all your drives out. I already have a box I could do this with, but my big concern about it is that the Netgear SC101 uses a proprietary file system. I could do with an external drive that will support two disks up to a TB.
Feel free to post your thoughts here anytime, I needed all the input I could get on this, and not finding any resulted in me just spouting what I had found out to the world on this blog!
Hi All,
Funny enough I’ve just been looking at something similar and have just implemented for a customer the following:
NAS solution – a Freecom Network Drive Pro 500GB (can also get as 1TB)
http://www.freecom.com/ecproduct_detail.asp?ID=3705&CatID=8020&sCatID=1146266&ssCatID=1147774
Then plug into this a USB HDD, such as the Freecom 250GB / 500GB HDD to back the data off:
http://www.freecom.com/ecproduct_detail.asp?ID=3400&CatID=8020&sCatID=1146266&ssCatID=1147446
Thoughts?
Darren
Interesting again, I’d also be possibly tempted by having a USB HDD enclosure and being able to slot different HDDs into it. Having one at home and one say at a friend’s house (in the absence of the fire-proof safe obviously!)
Or set up an IP-SEC VPN tunnel between the two sites…
Overkill maybe, I’d rather both disks were unplugged?
I wouldn’t worry about that too much, I’ve recently taken on a new engineer who specialises in data recovery from disks – as in he can re-build HDDs to get the data off – if you ever get stuck!
But off-site is definitely something to be considered, even if you run a 2 week rotation, one disk on your site backing up the NAS one week, then rotate it for the other the next….
The solution mentioned above was for a network with a couple of users at a farm, simple but effective.
All look like viable solutions although as a network design and installation tech I’d just like to add a few words.
Carry out a simple cost/benefit analysis to work out what the risks are, what the costs are and which solution offers the best mix of minimising risk against affordable cost.
With regards to RAID, I always thought that the way data was laid down was to a standard, not in a manner proprietary to the RAID card deployed (yes a hardware RAID solution will ALWAYS outperform a software solution). However, I think that my thoughts may have been tempered by the fact that I never had a controller fail on me and I always bought from the same manufacturer (Adaptec) who were the market leaders a few years ago (when competition was lower) and always offered backwards compatibility and a wide range of SCSI disk connectivity options.
With regards to 3 vs 4 disks, online spares and hot swaps, my experience is from an enterprise background where downtime was costly. In terms of reliability, I found hard drives to be the least reliable component in a server (fast SCSI drives ran hot), followed by cooling fans, processors (the two kind of go together like a “horse and carriage”) and power supplies. So my “typical” server would have a redundant power supply ready to kick in should the primary fail, and at least a 4 disk RAID array, although hard drives were smaller then, so a 5 or 6 disk array was more common, with one of the disks being an online spare. In the event of one drive failing, the online spare would automatically kick in, the array would self-rebuild and the server monitoring software would send the sysadmin a notification. Then SMART (Self-Monitoring, Analysis and Reporting Technology) came in, and the server was able to actively monitor the health of the disks and give advance warning of disk failure, allowing preventative maintenance to take over: the sysadmin would receive the warning and swap out the disk before it failed.
However, it’s worth noting that because these were enterprise (or larger small business) systems, we always had a tape streamer and enforced a rigorous tape back-up discipline as well. (Belt and braces really)
For a home set up, there are performance benefits of some types of RAID over others. However, if it is just to provide data storage and resilience (RAID 5 was never really a back-up solution, just a way of joining many drives together to overcome the small disk size of the time), I’d consider RAID 5, look for a controller with SMART and go for 4 drives configured in a 4 drive array. Then, if you need more storage, you can simply add another drive to the array (provided you have the physical space) and allow the array to rebuild itself and include the new drive. So a 4 x 500GB array would offer 1.5TB of storage, and adding another 500GB drive would increase that to 2TB.
Sorry for going on but I hope this makes sense.
Hi Andy,
Feel free to go on all you like, this makes a lot of sense. The RAID 5 was always my first temptation, but it quickly became apparent that an out of the box solution was almost entirely enterprise (there’s definitely a market there to create a home solution for people!).
I could build a 4 disk RAID 5 array quite cheaply, but once I started looking at getting an 8 disk RAID 5 card, the prices shot up. I know Darren has some views on RAID 5 – I just like the theory of being able to rebuild my data.
You make a good point about the cost/benefit ratio, when I started looking at this I found I could almost upsell myself into any solution. But I’ve held back to work out exactly what’s right for me!
As a side note Andy, I think we’ve virtually just become neighbours! lol
Has anyone considered just hand-transcribing everything or hiding your computer in the shed? 🙂
{looks over shoulder in surprise} “Virtual Neighbours??” and I don’t even watch Aussie soaps.
Of course, if you are just looking for a little online storage, you could always look at http://www.adrive.com who’ll give (yes GIVE) you 50GB online storage for free.
Not sure what their business model is so not sure on their longevity but it’s a lot of online space for nothing and well worth a look at, IMHO
Well not so much virtual, if the address on your website is anything to go by we’re at least living on the same estate!
It’s the longevity and the security of those services which puts me off 🙁
Ha, address on the site is current, it’s a small world really – I stumbled across your site when doing some client work relating to back-up solutions and who’d a thought it, for all the interwebs in all the towns you have to live in tis one. 🙂
I see you work for Motorola, I wasn’t sure whether they still had people working up there, at least you’re pretty close to work.
How do you find your BB connection, I had a real mare last year – went down for a hellishly long time as BT p++sed around……if you have any problems, I sympathise.