March 10th, 2010 by Chris G.
It might seem obvious, but those with less data have arguably simpler storage requirements. Perhaps not so obvious is how many different areas can be impacted as data grows: backup storage, backup duration, restoration, inter-server network connectivity, DR plans, system disk I/O performance, and (of course) your primary storage. More often than not, a review of one’s data reveals things that can be archived or deleted. Some examples:
- E-commerce sites might look to save space by archiving older orders or discontinued products.
- Data reporting/analysis systems might be able to archive or delete the raw data that the reports or analysis are built from.
- Those affected by regulatory compliance should understand their requirements for keeping logs or other items “online” versus “available”. In many cases, only the most recent items need to be kept online. Older items can be archived.
Case study
A popular number-crunching website used to try to keep everything online. This caused their database to grow until performance suffered and backups became unwieldy. Their database alone topped out at 300GB! INetU was asked to assist with tuning their database and scaling their backups. INetU found that over 75% of the data within the database was not regularly accessed, especially data older than 3 months.
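This excerpt does not show how the old data was actually handled, so the following is only a minimal sketch of one common approach: moving rows older than roughly three months into an archive table. The table names `results` and `results_archive`, the `created_at` column, and the use of SQLite are all assumptions made for illustration, not details from the case study.

```python
import sqlite3
from datetime import datetime, timedelta

def archive_old_rows(conn, cutoff_iso):
    """Copy rows older than cutoff_iso into the archive table, then delete them.

    Both statements run in one transaction, so a failure leaves the data untouched.
    """
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO results_archive SELECT * FROM results WHERE created_at < ?",
            (cutoff_iso,),
        )
        cur = conn.execute(
            "DELETE FROM results WHERE created_at < ?", (cutoff_iso,)
        )
    return cur.rowcount

# Hypothetical schema and sample data, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.execute("CREATE TABLE results_archive (id INTEGER PRIMARY KEY, created_at TEXT)")

now = datetime(2010, 3, 10)  # fixed "today" so the example is repeatable
sample_dates = [datetime(2009, 6, 1), datetime(2009, 11, 15), datetime(2010, 3, 1)]
conn.executemany(
    "INSERT INTO results (created_at) VALUES (?)",
    [(d.isoformat(),) for d in sample_dates],
)

# ISO-8601 timestamps sort lexicographically, so a string comparison works here.
cutoff = (now - timedelta(days=90)).isoformat()
moved = archive_old_rows(conn, cutoff)

remaining = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM results_archive").fetchone()[0]
print(f"moved={moved} remaining={remaining} archived={archived}")
```

In a production system the archive table would more likely live on cheaper storage or in a separate database, and the move would run in batches to avoid long locks.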
Read the full post »
Tags: data management, storage
January 20th, 2010 by Rich H.
RAID-5 was long hailed as the enterprise-level storage solution and a fit for nearly every application. The truth is, RAID-5 was designed back in the 1980s to save cost without completely sacrificing redundancy. Back then the cost per byte of storage on enterprise-class drives was so high that researchers were scrambling for a way to store more data for less money.
Let’s say you needed 100MB of storage space and disk-level redundancy. Let’s also say a 20MB SCSI drive cost $1,000.00. Before RAID-5, you’d buy 10 drives, create 5 RAID-1 arrays at 20MB each, and split your data set up to fit across these 5 separate arrays. Not only is this expensive at $10,000.00, but the storage space you require is split across 5 arrays. With RAID-5, six 20MB disks (costing $6,000.00) gave you 100MB of space in a single array, with redundancy. That saves $4,000.00 per storage unit implemented! Sure, there were caveats, but with those kinds of savings, nobody was paying attention.
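The arithmetic above can be checked with a few lines of Python. This is a rough sketch using the figures from the example; the function names are made up for illustration, and it assumes RAID-1 is built from two-drive mirrors and RAID-5 gives up exactly one drive's worth of capacity to parity.

```python
import math

def raid1_drives(usable_mb, drive_mb):
    """Drives needed when every drive's worth of usable space is a mirrored pair."""
    pairs = math.ceil(usable_mb / drive_mb)
    return 2 * pairs

def raid5_drives(usable_mb, drive_mb):
    """Drives needed for one RAID-5 array: data drives plus one drive of parity."""
    return math.ceil(usable_mb / drive_mb) + 1

NEEDED_MB = 100      # usable space required
DRIVE_MB = 20        # capacity of one drive
DRIVE_COST = 1000    # dollars per drive

raid1_cost = raid1_drives(NEEDED_MB, DRIVE_MB) * DRIVE_COST
raid5_cost = raid5_drives(NEEDED_MB, DRIVE_MB) * DRIVE_COST
savings = raid1_cost - raid5_cost

print(f"RAID-1: ${raid1_cost}, RAID-5: ${raid5_cost}, savings: ${savings}")
```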
Welcome to the 21st century. The database is king, and everyone wants performance! Unfortunately, one of RAID-5’s biggest caveats is sacrificing performance, and developers and admins are finally starting to notice. Let’s take a look at the 5 biggest caveats of the RAID level most synonymous with enterprise storage for so many years:
- Performance, Performance, Performance! RAID-5 has significant write penalties all the time due to the requirement for parity calculation. Most implementations also suffer poor read performance, even though RAID-5 proponents consider this one of the “strengths” of RAID-5.
- Rebuild times are horrifyingly slow. Try days instead of hours for large storage arrays, because every surviving disk in the array must be read and parity recalculated for each megabyte rebuilt. This can literally translate to days of downtime for a single disk failure, depending on the I/O performance required for the storage to be usable.
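Both caveats above come from how parity works. As a simplified sketch (toy two-byte blocks in memory, not a real controller or any particular RAID implementation), parity is the XOR of the data blocks in a stripe. Rebuilding a lost block means reading every surviving block, and a small write means reading the old data and old parity before writing both back. The helper names here are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across three data disks, plus its parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Rebuild: disk 1 is lost, so every surviving block (data and parity) must be read.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

def updated_parity(old_data, new_data, old_parity):
    """Small-write path: new parity = old data XOR new data XOR old parity.

    On real disks this costs two reads (old data, old parity) and
    two writes (new data, new parity) for a single logical write.
    """
    return xor_blocks([old_data, new_data, old_parity])

new_block = b"\xaa\xbb"
new_parity = updated_parity(data[1], new_block, parity)

print(rebuilt == data[1])
print(new_parity == xor_blocks([data[0], new_block, data[2]]))
```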
Read the full post »
Tags: array, controller, database, DB, I/O, performance, RAID, RAID-10, RAID-5, rebuild, spindles, storage