r4 - 31 Oct 2006 - 20:53:12 - ThomasLimoncelli
NO MORE UPDATES TO THIS PAGE PLEASE. SUBMIT ALL FUTURE COMMENTS TO tposana-bugs_at_wingfoot_org.gif

Data Storage Chapter

New Chapter General Advice

Please indicate what you would expect to see covered in this chapter. We will integrate your comments into the chapter writing process. As new chapters become ready for draft review and outside review, we will post that information here. If you are interested in reviewing a chapter, please be sure to leave comments here letting us know what you think should be in the chapter.

-- StrataChalup - 17 Aug 2006

This chapter is ready for draft review!

-- StrataChalup - 17 Aug 2006

If you wish, I can take a look at it and put in some comments.

-- StephenHarris - 29 Aug 2006

Hello Strata & Tom,

Sorry about the delay. I put these remarks together tonight and thought I would feed them back to you both "as is" rather than polishing them up too much. I understand that time is of the essence. The comments were written as I read the chapter and as such, I have provided clarification in later comments where I found my previous comments to be somewhat inaccurate (that is, saying something is not discussed when in fact it is discussed later in the chapter). There is a mix of specific comments with page numbers and general comments about the text. I hope you find them useful.

If you would like any further clarification on any of the comments, please E-mail me. I would be more than happy to re-review the chapter if you have made any changes since sending it to me. Furthermore, I have plenty of examples that highlight some of the concepts you speak about in the chapter, if you need them.

Kind Regards,

Nathan Dietsch

RAID

Page 559, I would put in an explanation describing the difference between a RAID 0 stripe and a RAID 0 concat. As it stands, the word "concatenates" is used in a confusing manner in the first sentence.
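To illustrate the stripe versus concat distinction, here is a minimal sketch of how a logical block number might map to a (disk, offset) pair under each layout. The function names, the stripe unit and the disk sizes are assumptions made up for the illustration; they are not taken from the chapter or from any particular volume manager.

```python
def concat_map(block, disk_blocks, n_disks):
    """Concatenation: fill disk 0 completely, then disk 1, and so on."""
    disk, offset = divmod(block, disk_blocks)
    if disk >= n_disks:
        raise ValueError("block is beyond the end of the volume")
    return disk, offset


def stripe_map(block, stripe_unit, n_disks):
    """Striping: move to the next disk every stripe_unit blocks."""
    chunk, within = divmod(block, stripe_unit)
    disk = chunk % n_disks
    offset = (chunk // n_disks) * stripe_unit + within
    return disk, offset


if __name__ == "__main__":
    # Hypothetical volume: 4 disks of 1000 blocks, stripe unit of 8 blocks.
    for block in (0, 8, 16, 24, 32):
        print(block, concat_map(block, 1000, 4), stripe_map(block, 8, 4))
```

With these assumed numbers, a sequential run of blocks stays on one disk in the concat layout but rotates across all four disks in the stripe layout, which is the difference that affects throughput.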

Page 561, paragraph 4, "but a single hot space" should read "but a single hot spare"

Page 562, I would add a reference which backs up the statement that RAID-3 is faster for multimedia applications. As an introductory article, you should either explain why it is faster in the paragraph or provide a reference which explains more about why RAID-3 is faster in that particular scenario. Same for RAID-4 and WAFL.

Page 563, RAID 5. You do not differentiate between a full-stripe write and a read-modify-write operation. This changes the performance dramatically. The EMC DMX-3 provides a good example of doing RAID-5 intelligently. Furthermore, the choice of hardware RAID-5 versus software RAID-5 should be discussed.
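A small sketch of why the two write paths differ, assuming a single-parity RAID-5 stripe where parity is the XOR of the data chunks. The helper names and the I/O-count model are my own illustration, not the behaviour of any specific array or of the DMX-3.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(chunks):
    """Full-stripe write: parity is computed from the new data alone."""
    return reduce(xor_bytes, chunks)


def rmw_parity(old_parity, old_data, new_data):
    """Read-modify-write: old data and old parity must be read first."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


def io_cost_full_stripe(n_disks):
    """One write per disk (data plus parity), no reads."""
    return {"reads": 0, "writes": n_disks}


def io_cost_rmw(chunks_updated=1):
    """Read old data and old parity, then write new data and new parity."""
    return {"reads": chunks_updated + 1, "writes": chunks_updated + 1}


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = full_stripe_parity(stripe)
    new_chunk = b"\xaa\xbb"
    # Both paths arrive at the same parity; they differ in I/O cost.
    assert rmw_parity(parity, stripe[1], new_chunk) == \
        full_stripe_parity([stripe[0], new_chunk, stripe[2]])
    print(io_cost_full_stripe(4), io_cost_rmw())
```

Under this model a small update costs four I/Os to change one chunk, whereas a full-stripe write needs no reads at all, which is the performance gap the chapter could spell out.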

Page 565, you allude to RAID0+1 (as opposed to RAID 10), but do not mention it specifically.

Page 565 RAID NOT A BACKUP STRATEGY: Examples are great for this sort of stuff.

Page 565: You describe RAID after describing the RAID levels; I believe there may have been a formatting mixup here.

Page 566&7: You repeat the "RAID is not a backup strategy" section.

Page 568&9: This section should be merged with the description of the various RAID levels. I mentioned that RAID-3 lacked a description of why it was faster for multimedia applications; the description here clears it up perfectly. You should merge them, as there is a lot of redundant information between the two sections.

NAS

Page 572, paragraph 5, the word servers is used twice in a row "servers servers shared" should be "servers shared"

General comment: There is no mention of NDMP when discussing NAS backups.

SANs & iSCSI

General comments:

I would personally like to see a differentiation between SCSI, the interconnect and SCSI, the command set. SCSI over FC and SCSI over IP are great examples of that. Furthermore, providing references to more detail would be good as well.

While this could also be applicable to DAS, there is no mention of how storage arrays create LUNs out of physical disks and present them as "disks". Providing an example of how EMC, Hitachi (or pick a vendor) does this would be useful.

I also think there should be a compare and contrast section on the various network storage technologies such as NAS, SAN and iSCSI: which type of storage is appropriate in different situations, and why. Discussing how network storage fits in with HA environments would also be good.

Filesystems

General comments: You talk about how filesystem caching is useful, but do not discuss the case where it gets in the way when using an RDBMS such as Oracle that implements its own caching.

Steve Pate's UNIX Filesystems book is an excellent reference worth pointing readers towards.

Journaled filesystems: One of the main benefits of a journaled filesystem is avoiding an fsck after an outage of some description. Example: Powering off a Solaris system without shutting it down used to scare a lot of administrators; now, if my home Solaris workstation locks up (damn PC), I power-cycle it and do not have to worry. UFS with logging is a wonderful thing. Note: I still shut down servers at work properly; they are a whole other story.

Problem and Recovery Concepts

Drive errors: I would put in a reference to the server monitoring chapter. Also, Solaris has implemented algorithms in the self-healing portions of Solaris 10 that detect patterns in drive errors and translate this into useful information messages for administrators.

Filesystem recovery: See comments on Journaled filesystems.

Summary: You allude to the composition of LUNs in a storage array, but more detail on this would be nice, perhaps not in the summary though.

Icing

Page 592: Recycling old drives, "disks for write-intensive applications such as ..." there is a word missing there.

Page 593: Applying mirroring for recoverability: You say "the system is not run in true RAID 0 mirror mode". I think you meant to say RAID-1. Veritas have implemented an enhancement to their mirroring software (dirty region logging) which allows you to only copy over the changes since the plex was detached. The alternative is that you need to resilver the entire mirror. EMC implement this as Business Continuance Volumes. I would be happy to provide examples of how we use this at work.
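To make the dirty region idea concrete, here is a conceptual sketch of tracking which regions of a volume were written while one side of the mirror was detached, so that only those regions need copying on reattach. The class name, region size and method names are hypothetical; this is not how Veritas or EMC actually implement the feature.

```python
class DirtyRegionLog:
    """Track regions written while one side of a mirror is detached."""

    def __init__(self, volume_size, region_size):
        self.region_size = region_size
        # Ceiling division: a partial trailing region still counts.
        self.n_regions = -(-volume_size // region_size)
        self.dirty = set()

    def mark_write(self, offset, length):
        """Record every region touched by a write of `length` bytes."""
        if length <= 0:
            raise ValueError("length must be positive")
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        self.dirty.update(range(first, last + 1))

    def resync_plan(self):
        """Regions to copy on reattach, instead of the whole volume."""
        return sorted(self.dirty)


if __name__ == "__main__":
    log = DirtyRegionLog(volume_size=1000, region_size=100)
    log.mark_write(150, 10)   # falls entirely inside region 1
    log.mark_write(290, 20)   # spans regions 2 and 3
    print(log.resync_plan(), "of", log.n_regions, "regions")
```

In this made-up example only three of ten regions would be copied, compared with all ten for a full resilver.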

Page 596: Timeouts: It is worth discussing how each component of the IO stack has its own timeout, and how these timeouts, combined with retries, can result in a long delay between a failure actually occurring and it being reported up to the application level. Example: At work we have an EMC Symmetrix. That Symmetrix could suffer a disk problem, and the disk may retry a number of times before reporting an error up to the front-end director. The HBA talking to the front-end director may time out and retry a number of times before the error is passed up to the OS, which retries the IO. The OS would report the error to the multi-pathing software (DMP and/or PowerPath), which could retry the IO. The multi-pathing software would pass the error to the volume management layer, which would retry before passing it to the filesystem, which would retry and then pass the error back to the application, which could also retry. Adding all of these retries together means that the application (a database in this case) could stall on a write until the error passes all the way back up the stack.
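A rough sketch of how stacked retries compound, under the simplifying assumption that each layer re-drives the whole stack beneath it on every attempt. The layer names follow the example above, but the attempt counts and the one-second base timeout are invented for illustration and are not measurements from any real system.

```python
def worst_case_delay(layers, base_timeout_s):
    """Worst-case seconds before the top of the stack sees the error.

    layers: (name, attempts) pairs listed from the bottom of the stack up.
    Assumes nested retries: each attempt at one layer costs the full
    worst-case delay of everything below it.
    """
    delay = base_timeout_s
    for _name, attempts in layers:
        delay *= attempts
    return delay


if __name__ == "__main__":
    # Hypothetical attempt counts per layer.
    stack = [
        ("disk", 3),
        ("hba", 2),
        ("os", 2),
        ("multipathing", 2),
        ("volume manager", 2),
        ("filesystem", 1),
        ("application", 1),
    ]
    print(worst_case_delay(stack, base_timeout_s=1.0), "seconds")
```

Even with small per-layer counts the product grows quickly, which is why the delay seen by the application can be far longer than any single timeout in the stack would suggest. Real stacks may add delays rather than multiply them at some layers, so treat this as an upper-bound style model only.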

Saturation: You should also explain that occasional saturation should be expected and is not necessarily a bad thing, but rather "a cost of doing business". Examples would be backups, which do large sequential IOs and will suck data off the disk as fast as they can. Large full table scans in databases can be equally unavoidable. Much like batch jobs for CPU, saturation can be a normal part of how a system works. This is especially the case with off-hosted tasks such as backups or feeds from copies of OLTP databases into data warehouses.

Evaluating new storage systems: You mention cheaper solutions cheating by using your system's CPU and memory. A prime example of this is software RAID-5 versus hardware RAID-5. Software RAID-5 is an application-killer rather than a killer-application, whereas hardware RAID-5, especially when implemented well, is really quite useful.

CDP: Some examples would be good: SRDF from EMC, TrueCopy from Hitachi, Veritas Volume Replicator from Veritas, ZFS send/receive from Sun, Standby database (log shipping) from Oracle.
