Saturday, January 30, 2010

RAID Level Pros/Cons

RAID 0
  • Characteristics: Uses striping but not redundancy of data; often not considered "true" RAID
  • Minimum physical drives: 2
  • Advantages: Provides the best performance because no parity calculation overhead is involved; relatively simple and easy to implement
  • Disadvantages: No fault tolerance; failure of one drive results in all data in the array being lost

RAID 1
  • Characteristics: Duplicates but does not stripe data; also known as disk mirroring
  • Minimum physical drives: 2
  • Advantages: Faster read performance, since both disks can be read at the same time; provides the best fault tolerance, because data is 100 percent redundant
  • Disadvantages: Inefficient; high disk overhead compared to other RAID levels

RAID 2
  • Characteristics: Disk striping with error checking and correcting (ECC) information stored on one or more dedicated disks
  • Minimum physical drives: Many
  • Advantages: Very reliable; faults can be corrected on the fly from the stored correcting information
  • Disadvantages: High cost; entire disks must be devoted to storing correction information; not considered commercially viable

RAID 3
  • Characteristics: Striping with one dedicated drive to store parity information; embedded error checking (ECC) is used to detect errors
  • Minimum physical drives: 3
  • Advantages: High data transfer rates; disk failure has a negligible impact on throughput
  • Disadvantages: Complex controller design; best implemented as hardware RAID instead of software RAID

RAID 4
  • Characteristics: Large stripes (data blocks) with one dedicated drive to store parity information
  • Minimum physical drives: 3
  • Advantages: Takes advantage of overlapped I/O for fast read operations; low ratio of parity disks to data disks
  • Disadvantages: No I/O overlapping is possible in write operations, since every write must update the parity drive; complex controller design

RAID 5
  • Characteristics: Stores parity information across all disks in the array; requires at least three and usually five disks
  • Minimum physical drives: 3
  • Advantages: Better read performance than mirrored volumes; read and write operations can be overlapped; low ratio of parity disks to data disks
  • Disadvantages: Most complex controller design; more difficult to rebuild after a disk failure; best suited to systems in which performance is not critical or that perform few write operations

RAID 6
  • Characteristics: Similar to RAID 5 but with a second parity scheme distributed across the drives
  • Minimum physical drives: 4
  • Advantages: Extremely high fault tolerance; can survive two simultaneous drive failures
  • Disadvantages: Few commercial examples at present

RAID 7
  • Characteristics: Uses a real-time embedded operating system controller, high-speed caching, and a dedicated parity drive
  • Minimum physical drives: 3
  • Advantages: Excellent write performance; scalable host interfaces for connectivity or increased transfer bandwidth
  • Disadvantages: Very high cost; only one vendor (Storage Computer Corporation) offers this system at present

RAID 10
  • Characteristics: An array of stripes in which each stripe is a RAID 1 array of drives
  • Minimum physical drives: 4
  • Advantages: Higher performance than RAID 1
  • Disadvantages: Much higher cost than RAID 1

RAID 53
  • Characteristics: An array of stripes in which each stripe is a RAID 3 array of disks
  • Minimum physical drives: 5
  • Advantages: Better performance than RAID 3
  • Disadvantages: Much higher cost than RAID 3

RAID 0+1
  • Characteristics: A mirrored array of RAID 0 arrays; provides the fault tolerance of RAID 5 and the fault-tolerance overhead of RAID 1 (mirroring)
  • Minimum physical drives: 4
  • Advantages: Multiple stripe segments enable high information-transfer rates
  • Disadvantages: A single drive failure causes the whole array to revert to, in effect, a RAID 0 array; also expensive to implement and imposes a high overhead on the system

Multi-level RAID types

RAID 0+1 (Mirror of Stripes, RAID 01, or RAID 0 then RAID 1)

  • Drives required (minimum): 4 (requires an even number of disks)
  • Max capacity: Number of disks x Disk capacity / 2
  • Description: RAID 0+1 is a mirror (RAID 1) of a stripe set (RAID 0). For example, suppose you have six hard disks. To create a RAID 0+1 array, you would take three of the disks and create a RAID 0 stripe set with a total capacity of three times the size of each disk (number of disks x capacity of each disk). You would then mirror the contents of this stripe set onto the other three disks.
  • Pros: A RAID 0+1 set could theoretically withstand the loss of all of the drives in one of the RAID 0 arrays and remain functional, since all of the data is mirrored to the other RAID 0 array. In practice, however, the failure of two drives will usually compromise the array: many RAID controllers take an entire RAID 0 set offline as soon as one of its disks fails (a RAID 0 array provides no redundancy of its own), leaving only the other RAID 0 set active, which likewise has no redundancy. In short, total array failure requires the loss of just one drive from each RAID 0 set. Provides very good sequential and random read and write performance.
  • Cons: Usable capacity is only 50% of the total disk capacity. Not as fault tolerant as RAID 10; with most controllers, it can withstand the loss of only a single drive. Scalability is limited and expensive.

RAID 10 (Stripe of Mirrors, RAID 1+0, or RAID 1 then RAID 0)

  • Drives required (minimum): 4 (requires an even number of disks)
  • Max capacity: Number of disks x Disk capacity / 2
  • Description: RAID 10 is a stripe (RAID 0) of multiple mirror sets (RAID 1). Again, suppose you have six hard disks. To create a RAID 10 array, take two of the disks and create a RAID 1 mirror set with a total capacity of one disk in the array. Repeat the same procedure twice for the other four disks. Finally, create a RAID 0 array that houses each of these mirror sets.
  • Pros: A RAID 10 set can withstand the loss of one disk in every RAID 1 mirror, but cannot withstand the loss of both disks in any one mirror (see the failure-simulation sketch after this section). As with RAID 0+1, RAID 10 provides very good sequential and random read and write performance. These multilevel RAID arrays can often outperform their single-digit counterparts because they can read from and write to multiple disks at once.
  • Cons: Usable capacity is only 50% of the total disk capacity. Scalability is limited and expensive.
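
To make the fault-tolerance difference concrete, here is a minimal, illustrative Python sketch (not modeled on any particular controller) of the simplified rules described above: RAID 10 survives as long as no mirror pair loses both members, while RAID 0+1, given the common controller behavior noted earlier, survives only if at least one whole RAID 0 side remains failure-free.

```python
# Toy model: drives are numbered 0..n-1.
# RAID 10:  drives are paired into mirrors (0,1), (2,3), (4,5), ...
# RAID 0+1: the first half of the drives forms one RAID 0 side,
#           the second half forms its mirror.

def raid10_survives(num_drives, failed):
    """RAID 10 survives unless both members of some mirror pair have failed."""
    failed = set(failed)
    for first in range(0, num_drives, 2):
        if first in failed and first + 1 in failed:
            return False
    return True

def raid01_survives(num_drives, failed):
    """RAID 0+1 (as handled by many controllers) survives only if at least
    one complete RAID 0 side has no failed drive."""
    failed = set(failed)
    half = num_drives // 2
    side_a_ok = not any(d < half for d in failed)
    side_b_ok = not any(d >= half for d in failed)
    return side_a_ok or side_b_ok

if __name__ == "__main__":
    # Six drives, two failures that land in different mirror pairs / sides.
    print(raid10_survives(6, {0, 3}))  # True:  no mirror pair lost both disks
    print(raid01_survives(6, {0, 3}))  # False: each RAID 0 side lost a disk
```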

RAID 50 (Stripe of Parity Set, RAID 5+0, or RAID 5 then RAID 0)

  • Drives required (minimum): 6
  • Max capacity: (Drives in each RAID 5 set – 1) x Number of RAID 5 sets x Disk capacity (a small capacity calculator follows this section)
  • Description: RAID 50 is a stripe (RAID 0) of multiple parity sets (RAID 5). This time, suppose you have twelve hard disks. To create a RAID 50 array, take four of the disks and create a RAID 5 set (striping with parity) with a total capacity of three times the size of each disk (remember, in RAID 5 you "lose" one disk's worth of capacity to parity). Repeat the same procedure twice with the other eight disks. Finally, create a RAID 0 array that stripes across these three RAID 5 sets.
  • Pros: A RAID 50 set can withstand the loss of one disk in every RAID 5 array, but cannot withstand the loss of multiple disks in one of the RAID 5 arrays. RAID 50 provides good sequential and random read and write performance. These multilevel RAID arrays can often perform better than their single-digit counterparts due to the ability to read from and write to multiple disks at once.
  • Cons: RAID 50 is somewhat complex and can be expensive to implement. A rebuild after a drive failure can seriously hamper overall array performance.
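
The usable-capacity formulas quoted in the RAID 0+1, RAID 10, and RAID 50 sections above are easy to sanity-check in code. The following Python sketch simply evaluates those formulas; the disk counts and sizes are made-up examples.

```python
def usable_raid10_or_01(num_disks, disk_capacity_gb):
    """RAID 10 and RAID 0+1: half of the raw capacity is lost to mirroring."""
    return num_disks * disk_capacity_gb / 2

def usable_raid50(disks_per_raid5_set, num_raid5_sets, disk_capacity_gb):
    """RAID 50: each RAID 5 set gives up one disk's worth of capacity to parity."""
    return (disks_per_raid5_set - 1) * num_raid5_sets * disk_capacity_gb

if __name__ == "__main__":
    # Example figures only: six 500 GB disks in RAID 10 or 0+1, and
    # twelve 500 GB disks arranged as three 4-disk RAID 5 sets in RAID 50.
    print(usable_raid10_or_01(6, 500))  # 1500.0 GB usable
    print(usable_raid50(4, 3, 500))     # 4500.0 GB usable
```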

"Single digit" RAID Types

"RAID" is now used as an umbrella term for computer data storage schemes that can divide and replicate data among multiple hard disk drives. The different schemes/architectures are named by the word RAID followed by a number, as in RAID 0, RAID 1, RAID  5 etc. RAID's various designs involve two key design goals: increase data reliability and/or increase input/output performance. When multiple physical disks are set up to use RAID technology, they are said to be in a RAID array. This array distributes data across multiple disks, but the array is seen by the computer user and operating system as one single disk. RAID can be set up to serve several different purposes.

RAID 0 (Disk striping)

  • Drives required (minimum): 2
  • Max capacity: Number of disks x disk capacity
  • Description: Data to be written is broken down into blocks, with each block written to a separate disk (a small striping sketch follows this list).
  • Pros: Very fast, since data is written to and read from storage over multiple "spindles", meaning that the I/O load is distributed. The more disks you add, the better the performance (in theory). As always, if you're expecting huge performance gains, test your storage with a tool such as IOmeter, as the real-world gains may not be that great.
  • Cons: When a single drive fails, the entire array is compromised, since this RAID level includes no safeguards. As more disks are added, the risk of failure increases.
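
Here is a minimal Python sketch of the block-striping idea described above. It is purely illustrative (an in-memory model, not a driver): data is cut into fixed-size blocks and dealt out round-robin across a set of "disks", then reassembled on read.

```python
BLOCK_SIZE = 4  # bytes per block; real stripe sizes are typically 64 KB or more

def stripe_write(data, num_disks):
    """Split data into blocks and deal them round-robin across the disks."""
    disks = [[] for _ in range(num_disks)]
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks

def stripe_read(disks):
    """Reassemble the original data by reading blocks back in round-robin order."""
    total_blocks = sum(len(d) for d in disks)
    data = b""
    for i in range(total_blocks):
        data += disks[i % len(disks)][i // len(disks)]
    return data

if __name__ == "__main__":
    original = b"the quick brown fox jumps over the lazy dog"
    disks = stripe_write(original, 3)
    assert stripe_read(disks) == original
    print([len(d) for d in disks])  # blocks are spread (nearly) evenly across disks
```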

RAID 1 (Disk mirroring)

  • Drives required (minimum): 2 (or multiples of 2)
  • Max capacity: Total array capacity divided by 2
  • Description: All data written to the storage system is replicated to two physical disks, providing a high level of redundancy (a small mirroring sketch follows this list).
  • Pros: Very reliable, assuming only a single disk per pair fails. RAID 1 tends to provide good read performance (equal to or better than a single drive).
  • Cons: Because each drive is mirrored to another, RAID 1 requires 100% disk overhead. Write performance can suffer slightly because data must be written to two drives, but it is often still better than the write performance of other RAID levels.
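
And a similarly minimal Python sketch of mirroring (again an in-memory model, not a real driver): every write lands on both copies, and reads can be served from either copy, which is where the read-performance benefit comes from.

```python
class MirroredVolume:
    """Toy RAID 1 volume: two block maps that always hold identical data."""

    def __init__(self):
        self.disk_a = {}
        self.disk_b = {}
        self.read_from_a = True  # alternate reads between the two copies

    def write(self, block_number, data):
        # Every write goes to both disks, so either copy can satisfy a read.
        self.disk_a[block_number] = data
        self.disk_b[block_number] = data

    def read(self, block_number):
        # Alternate between copies, roughly modeling how the read load spreads.
        source = self.disk_a if self.read_from_a else self.disk_b
        self.read_from_a = not self.read_from_a
        return source[block_number]

if __name__ == "__main__":
    vol = MirroredVolume()
    vol.write(0, b"hello")
    vol.write(1, b"world")
    print(vol.read(0), vol.read(1))  # each read may be served by a different copy
```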

RAID 2: This RAID level is no longer used.

RAID 3 (Parallel transfer disks with parity)

  • Drives required (minimum): 3
  • Max capacity: (Number of disks minus 1) x capacity of each disk
  • Description: Data is broken down to the byte level and evenly striped across all of the data disks until complete. All parity information is written to a separate, dedicated disk.
  • Pros: Tolerates the loss of a single drive. Reasonable sequential write performance. Good sequential read performance. Makes efficient use of capacity, since only one disk is devoted to parity.
  • Cons: Rarely used, so troubleshooting information can be sparse. Requires hardware RAID to be truly viable. Poor random write performance. Fair random read performance.

RAID 4 (Independent data disks with shared parity blocks)

  • Max capacity: (Number of disks minus 1) x capacity of each disk
  • Description: A file is broken down into blocks and each block is written across multiple disks, but not necessarily evenly. Like RAID 3, RAID 4 uses a separate physical disk to handle parity. Excellent choice for environments in which read rate is critical for heavy transaction volume.
  • Drives required (minimum): 3
  • Pros: Very good read rate. Tolerates the loss of a single drive.
  • Cons: Write performance is poor, since every write must update the dedicated parity disk. Block read performance is only fair.

RAID 5 (Independent access array with rotating parity)

  • Max capacity: (Number of disks - 1) x capacity of each disk
  • Description: Like RAID 4, blocks of data are written across the entire set of disks (sometimes unevenly), but in this case the parity information is distributed across all of the disks along with the data (a small parity sketch follows this list).
  • Drives required (minimum): 3
  • Pros: Well supported. Tolerates the loss of a single drive.
  • Cons: Performance during a rebuild can be quite poor. Write performance is sometimes only fair due to the need to constantly update parity information.
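
The parity that lets RAID 5 survive a single drive failure is just a bitwise XOR of the data blocks in each stripe. A minimal, illustrative Python sketch with tiny 4-byte blocks:

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

if __name__ == "__main__":
    # One stripe spread across three data disks plus one parity block.
    data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(data_blocks)

    # Simulate losing the second disk: rebuild its block from the survivors
    # plus parity, which is exactly what a RAID 5 rebuild does stripe by stripe.
    surviving = [data_blocks[0], data_blocks[2], parity]
    rebuilt = xor_blocks(surviving)
    assert rebuilt == data_blocks[1]
    print("recovered:", rebuilt)
```

This is also why small RAID 5 writes are relatively slow: each one must read the old data and parity, recompute the XOR, and write both back.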

RAID 6 (Independent Data disks with two independent distributed parity schemes)

  • Max capacity: (Number of disks - 2) x capacity of each disk
  • Description: Like RAID 5, blocks of data and parity are distributed across the entire set of disks, but in this case two independent parity blocks are calculated and stored for each stripe.
  • Drives required (minimum): 4
  • Pros: Tolerates the loss of up to two drives. Read performance is good. Excellent for absolutely critical applications.
  • Cons: Write performance is worse than RAID 5 due to the need to update two parity sets on every write. Performance can degrade heavily during a rebuild.

Wednesday, January 27, 2010

Fortune "100 Best Companies to Work For" - 2010

Each year, FORTUNE magazine compiles the "100 Best Companies to Work For" list from a pool of eligible U.S.-based applicants.

For more details on this year’s ranking and process, go to http://money.cnn.com/magazines/fortune/bestcompanies/2010/.

EMC Switch Analysis Tool (SWAT)

The Switch Analysis Tool (SWAT) is a Web-based application that processes the output of native commands from Brocade, Cisco and McDATA switches and performs the following functions:
  • Displays information about the switch properties, effective configuration, name server entries, port statistics, fabric OS file system, zone checks, environment, memory, licensing, VSAN and some logging checks.
  • Provides notices and warnings of potential problem areas where appropriate.
  • Provides recommendations where appropriate.

Friday, January 22, 2010

Host Environment Analysis Tool (HEAT)

The Host Environment Analysis Tool (HEAT) is a Web-based application that…
  • Processes the output of the EMCReports script for Windows 2000 and Windows 2003 hosts and performs the following functions:
    • Displays information about the host, memory details, IRQ levels, Windows services, network adapters, disk drives, file system alignment, SCSI, drivers, host bus adapters, installed software and hot-fixes, EMC PowerPath and Solutions Enabler, Symmetrix, CLARiiON, Celerra software, device mapping and application and event log checking.
    • Checks the versions of system drivers, HBA drivers and firmware, EMC PowerPath and Solutions Enabler software, volume management software, and EMC disk array software against the latest EMC-supported versions.
    • Provides notices and warnings of potential problem areas where appropriate.
    • Provides recommendations where appropriate.
  • Processes the output of the EMCGrab scripts for AIX, HP-UX, Linux, Tru64/OSF1, and Solaris hosts and performs the following functions:
    • Displays information about the host, OS, OS patches, host bus adapters, multipathing, drivers, file systems, installed volume management software, EMC PowerPath and Solutions Enabler software, Symmetrix, CLARiiON, and Celerra software, device mapping, and application and event log checking.
    • Checks the versions of system drivers, HBA drivers and firmware, EMC PowerPath and Solutions Enabler software, volume management software, and EMC disk array software against the latest EMC-supported versions.
    • Provides notices and warnings of potential problem areas where appropriate.
    • Provides recommendations where appropriate.

Thursday, January 21, 2010

VMware Host Environment Analysis Tool (VMHEAT)

The VMware Host Environment Analysis Tool (VMHEAT) is a Web-based application that…
  • Processes the output of the ESX EMCGrab scripts for ESX 3.x hosts and performs the following functions:
    • Displays information about the host, OS, OS patches, host bus adapters, drivers, file systems, installed volume management software and Solutions Enabler software, Symmetrix, CLARiiON, Celerra software, device mapping and application and event log checking.
    • Checks the versions of system drivers, HBA drivers and firmware, Solutions Enabler software, volume management software, and EMC disk array software against the latest EMC-supported versions.
    • Provides notices and warnings of potential problem areas where appropriate.
    • Provides recommendations where appropriate.

EMC PowerPath Configuration Checker - PPCC


PowerPath Configuration Checker (PPCC) is a software program that verifies that a host is configured with the hardware and software required for PowerPath multipathing features (failover and load-balancing functions, licensing, and policies), as specified in the EMC Support Matrix (ESM). The utility analyzes system configuration information collected via the EMCReports and EMCGrab utilities. PPCC currently runs on Windows and Linux platforms only.
PPCC works by analyzing data collected by EMCReports for Windows and EMCGrab for UNIX version 4.x. EMCReports and EMCGrab are widely used EMC diagnostic utilities that collect host configuration data, system logs, and other host information used for EMC software problem analysis and resolution.

PPCC performs the following checks:
  • OS version
  • Machine architecture, per the ESM (EMC Support Matrix)
  • PowerPath version
  • PowerPath eFix/license
  • License policy
  • I/O timeout
  • EOL and EOSL (end of life and end of service life)
  • HBA model
  • HBA driver
  • HBA firmware
  • Kernel version
  • Symmetrix microcode
  • Symmetrix model
  • Symapi version
  • CLARiiON fail-over
  • CLARiiON FLARE
  • Upgrade path
  • Veritas DMP version
  • Powermt custom
  • CLARiiON model
  • Other multipathing software
  • Third-party array validation
  • OS patches
Click here to download PPCC (PDF)

Wednesday, January 20, 2010

EMC doubles density of Clariion, Celerra storage systems

EMC Corp. today announced a new higher-density configuration of its Clariion CX4 midrange storage array and Celerra network-attached storage (NAS) gateway device, offering twice the capacity of previous systems in half the floor space. 
The mid-range storage systems, which now can house twice the number of hard disk drives in half the frame space, will also support lower-power 2TB SATA drives, compared to the 1TB drives supported by earlier systems.
The higher-density Clariion can be configured with the 2TB SATA drives as well as high-performance enterprise solid-state drives (SSDs). The Clariion uses power-efficiency technology such as disk spin-down, along with EMC fully automated storage tiering (FAST) for automated data migration between internal disks. Disk spin-down puts drives into sleep mode when they're not in use, yielding a 65% power savings compared to always-on SATA drives, according to Ruya Atac-Barrett, director of Clariion product marketing at EMC.
A Clariion array can support up to 480 drives, or 960TB of raw capacity. A previous Clariion model array would have taken up six data center floor tiles, while the new model takes up only three.
The higher density is achieved through a re-architecture of the frame, making it five inches deeper so that two disk drive trays now fit front to back.
Click here to read the original story.

Friday, January 8, 2010

Virtual Clone - Taking your clone Virtually anywhere!

Iomega's new v.Clone software sounds like a start. Basically, it backs up your C: drive into a bootable, standalone, app-wrapped VMware image, which can run off a compatible Iomega drive (the new eGo and Prestige lines, for starters) on any other Windows computer. Any changes you make to your system in VM mode are then synced back to your main machine when you return.


There could be some performance issues with this initial release, but it all depends on how feasible and promising the approach turns out to be.

Wednesday, January 6, 2010

Petabyte age taken over by the Exabyte era



In the storage world of ever-growing data capacity, more isn't just more; MORE is different!


Having gotten used to gigabytes and terabytes in today's market, we should now be thinking about exabyte capacity. Put 1,024 bytes together and you have a kilobyte (KB). Take the same number of KB (1,024) and you've built a megabyte (MB); 1,024 MB make a gigabyte (GB), and so on through the terabyte (TB); finally, 1,024 TB is a petabyte (PB). So how many bytes are in an exabyte?

That's 1,024 PB, or roughly 1,000,000,000,000,000,000 bytes (more precisely, 2^60 = 1,152,921,504,606,846,976 bytes).
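
For anyone who wants to see the arithmetic, here is a tiny Python sketch that walks the 1,024-multiples chain from bytes up to an exabyte:

```python
units = ["KB", "MB", "GB", "TB", "PB", "EB"]
size = 1
for unit in units:
    size *= 1024  # each unit is 1,024 of the previous one
    print(f"1 {unit} = {size:,} bytes")
# The final line prints 1 EB = 1,152,921,504,606,846,976 bytes (2**60),
# roughly 1.15 quintillion bytes.
```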

NetApp reached one of these milestones in the past 52 weeks by shipping an exabyte of storage.

Tuesday, January 5, 2010

EMC TechBooks on storage systems with VMware

EMC TechBooks for VMware environments are targeted manuals written by EMC Engineering and Solutions Architect teams that cover the integration of EMC products with leading industry technologies such as databases, applications, and various operating system environments. These guides supplement existing product documentation. EMC TechBooks can provide real-world examples with implementation details, hints and tips, and sample code. They eliminate some of the risks and tradeoffs by leveraging industry knowledge and best practices, thereby leading to more successful implementations.

Saturday, January 2, 2010

SAN vs. NAS

SAN (storage area network) and NAS (network-attached storage) are differentiated by the protocols used to communicate between devices.

Data Transport Protocol:

A SAN uses the SCSI protocol to transfer data between devices, typically carried over transports such as FCP, iSCSI, or FCoE. The SAN storage device presents RAID volumes as block storage, and hosts format their own filesystems on their assigned volumes.

NAS is accessed using file-serving protocols such as NFS, CIFS, HTTP, or FTP to transfer data over the network. The NAS device mounts a filesystem formatted on its internal RAID volume and exports the storage as directories and folders.


A SAN does block-level I/O and NAS does file-level I/O to the storage. A SAN addresses data by logical block number and transfers (raw) disk blocks, whereas NAS identifies data by file name and byte offset, transfers file data or metadata, and handles security, user authentication, file locking, and so on. When a server accesses NAS storage, it sees it as a share (when mapped); when a server accesses SAN storage, it shows up as an actual drive. The sketch below illustrates the difference in how the two are addressed.
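
The following Python sketch is purely illustrative: the device path and mount-point path are hypothetical examples, and on a real system a SAN LUN would normally be accessed through a filesystem or volume manager rather than read raw.

```python
BLOCK_SIZE = 512  # bytes per logical block on a typical disk

def read_san_block(device_path, lba, count=1):
    """Block-level access (SAN style): address data by logical block number."""
    with open(device_path, "rb") as dev:
        dev.seek(lba * BLOCK_SIZE)           # position by block address
        return dev.read(count * BLOCK_SIZE)  # raw disk blocks come back

def read_nas_file(mounted_path, offset, length):
    """File-level access (NAS style): address data by file name and byte offset."""
    with open(mounted_path, "rb") as f:
        f.seek(offset)
        return f.read(length)

if __name__ == "__main__":
    # Hypothetical paths: a SAN LUN presented as a local block device, and a
    # file on an NFS/CIFS share mounted from a NAS device.
    san_data = read_san_block("/dev/sdb", lba=2048, count=8)
    nas_data = read_nas_file("/mnt/nas_share/reports/q4.txt", offset=0, length=4096)
    print(len(san_data), len(nas_data))
```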