Okay, what the hell happened to my RAID?

Discussion in 'OT Technology' started by deusexaethera, Sep 13, 2008.

  1. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    I ran PCMark05 in January 2007 and scored over 5000 points on the hard drive tests. I ran it again, three times, over the past couple of days, and I get about 1450 points. Even my old single disk scored 3000!

    All of the cables are secure, the controller is working fine, the array isn't degraded, the array is only 2/3 full, the partition is healthy...what the hell else could be going wrong?
     
  2. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    No ideas?
     
  3. eideteker

    eideteker Who jarked off in my frakkin' coffee? OT Supporter

    Joined:
    Feb 15, 2006
    Messages:
    2,428
    Likes Received:
    0
    Location:
    PA.
    Nope. Where's old potty when you need him?
     
  4. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    No cache battery to start with, unfortunately.

    I've done some more cockamamie troubleshooting:

    - Uninstalled Avast! antivirus;
    - Uninstalled XP sp3;
    - Degraded and rebuilt the array;
    - Unplugged absolutely everything and plugged it back in;
    - Moved all non-essential data off the RAID onto a FireWire drive;
    - Ran a full surface scan (even though the RAID doesn't have a "surface" to speak of).

    Nothing. No goddamned progress at all. The score still hovers around 1440, give or take ~10 points, which is almost enough to convince me there isn't actually anything wrong to start with; if something were failing I gotta think its performance would fluctuate by more than ±1%.
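
    As a quick check on the "±1%" figure, here is a minimal sketch of the arithmetic, using only the numbers quoted in this post (a score of about 1440 and a swing of about 10 points). The variable names are just illustrative.

```python
# Relative spread of the PCMark05 hard drive score quoted in the post.
baseline = 1440   # typical score reported above
swing = 10        # "give or take ~10 points"

relative_spread = swing / baseline
print(f"spread: +/-{relative_spread:.2%}")
```

    The swing works out to well under 1% of the score, which is consistent with the claim that the result is stable from run to run.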

    Does anyone know how much of a performance hit hard drives take when they have a lot of bad sectors that have been relocated to "spare" sections of the disk? It's my understanding that all disks do this nowadays, instead of making you scan for errors manually.
     
  5. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    Consumer-level RAID sucks ass when you have actual data? I'm not being a smart ass. Couldn't that be it?
     
  6. Harry Caray

    Harry Caray Fine purveyor of x.264, h.264 & TS HD-Video !!! HD

    Joined:
    Apr 19, 2001
    Messages:
    17,176
    Likes Received:
    5
    Location:
    MyCrews:4x4,SoCal,Tesla,EV's
    All things aside (like "scores" and "points" :rolleyes: ) how about some real world #'s that we use in the field ???

    For example, my 3Ware 9650 in RAID-6 with 12 1TB ES2 drives is sitting at about 520-730MB/s....

    Try that and come back... but as for your card, slowdowns like this usually come from being in degraded mode or bad stripe write mode. Do you have "auto carving" or write-hole flushing turned on or off?

    What RAID level is this? Controller? Bus type? Etc., etc...
     
  7. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    In theory, yes, but the controller I have has a dedicated XOR unit and 64MB of cache. And in any event, a cheap controller wouldn't explain why the performance would drop so dramatically when the quantity of actual data on the array hasn't increased substantially in a couple of years, or why it would drop below the performance of the old single drive it replaced, which was smaller and slower in addition to being older.
     
  8. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    Fortunately, PCMark's score is based on actual data transfer speeds. See below, pulled from The ORB:

    Code:
    2007-01-06
    ----------
    XP Startup:           8.17MB/s
    Application Loading:  7.32MB/s
    General Usage:        7.04MB/s
    Virus Scan:          70.33MB/s
    File Write:          83.58MB/s
    
    2008-09-12
    ----------
    XP Startup:           2.85MB/s
    Application Loading:  2.81MB/s
    General Usage:        2.03MB/s
    Virus Scan:          71.75MB/s
    File Write:           2.15MB/s
    It's a 3-disk RAID3 using WD Raptors, plugged into a NetCell SATA controller. It's all hardware-driven, no processes inserted into the CPU. It's not the best RAID controller ever, but the price was good and it worked fine when I got it. I suppose it's possible the controller is failing, but I'm not getting any data errors -- or rather, none that the card and/or the monitoring software see fit to tell me about.

    I have no idea what bad stripe write mode, auto-carving, or write-hole flushing mean. :(
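
    To make the two ORB result sets in this post easier to compare, here is a rough sketch that computes the percentage change per test. The throughput values are copied from the listing; the dictionary and function names are only illustrative.

```python
# Throughput figures (MB/s) copied from the two ORB result sets in this post.
results_2007 = {
    "XP Startup": 8.17,
    "Application Loading": 7.32,
    "General Usage": 7.04,
    "Virus Scan": 70.33,
    "File Write": 83.58,
}
results_2008 = {
    "XP Startup": 2.85,
    "Application Loading": 2.81,
    "General Usage": 2.03,
    "Virus Scan": 71.75,
    "File Write": 2.15,
}


def percent_change(old, new):
    """Change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


changes = {
    name: percent_change(results_2007[name], results_2008[name])
    for name in results_2007
}

for name, change in changes.items():
    old, new = results_2007[name], results_2008[name]
    print(f"{name:<20} {old:>6.2f} -> {new:>6.2f} MB/s ({change:+.1f}%)")
```

    Laid out this way, the Virus Scan result is essentially unchanged, while File Write is down by roughly 97% and the three mixed-usage tests are down by well over half.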
     
  9. Harry Caray

    Harry Caray Fine purveyor of x.264, h.264 & TS HD-Video !!! HD

    Joined:
    Apr 19, 2001
    Messages:
    17,176
    Likes Received:
    5
    Location:
    MyCrews:4x4,SoCal,Tesla,EV's
    those are bad #'s.... usually means that the RAID card is in "degraded" or safe mode.. you sure a drive isn't out?

    Have you gone into the BIOS? Or is this a SW/mobo RAID setup?

    EDIT: NM, see that above...
     
  10. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    I'm positive it's not degraded -- it degraded once during a power outage and when the computer came back online, the monitoring software instantly panicked and demanded attention. I popped the drive out of its hot-swap bay, waited for it to spin down, and plugged it back in, and it rebuilt itself in an hour.

    I even decided to go out on a limb and yank a drive while the computer was running last night, wait for the controller to panic, then plug the drive back in and let it rebuild itself. I figured at least that way, if one of the drives was already dead and the controller was lying to me, the computer would crash within seconds. No catastrophe occurred, and the array again rebuilt itself in an hour or so.

    The controller's BIOS reports no errors. The only thing left I can think of is to pull the drives and scan them individually with some kind of error-checking software, but I don't have any other machines that can handle SATA.

    EDIT: Safe mode?
     
    Last edited: Sep 16, 2008
  11. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    I believe by 'safe mode' he meant one drive was out.

    Never even heard of anyone using RAID 3, but based upon this (google hit #2) it seems that as you get more data the performance of your array would decrease:

     
  12. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    RAID3 favors reading data vs. writing data, whereas RAID5 is more balanced between the two. I'm not sure I believe that the quantity of data on the array would affect I/O speed, because I could have 50,000TB of data on an array and the controller would still only be reading one stripe at a time, in the order the requests came in (disregarding TCQ/NCQ for the moment, since I don't have it).
     
  13. trouphaz

    trouphaz New Member

    Joined:
    Sep 22, 2003
    Messages:
    2,666
    Likes Received:
    0
    RAID3... lollers.
     
  14. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    More data on disk = slower reads on the parity check, yes? Especially if your raid card is arranging the data all retarded. Which I suspect a cheap one might.

    RAID 3? Admit it, you were just being a smartass when you chose that.
     
  15. EvilSS

    EvilSS New Member

    Joined:
    Jun 11, 2003
    Messages:
    5,104
    Likes Received:
    0
    Location:
    STL
    It's that funky NetCell he has; it does some sort of modified RAID 3. Too bad their company finances weren't redundant also....
     
  16. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    RAID IS FOR SUCKERS.
     
  17. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    I'm not interested in debating NetCell's approach or its finances. It works the same as any other RAID, except that instead of having a stripe size of 64kB or somesuch, it has a stripe size of 1B; I'll let you speculate on the advantages and disadvantages of that.

    What I am concerned with is that my RAID, which has functioned admirably for two years, has suddenly slowed to a crawl -- or else something in Windows is causing PCMark to think it's slowed to a crawl. I'm not really sure how else to test the RAID's performance without using some kind of benchmark, but I've tried everything else I can think of that could be interfering with its operation.

    Does anyone else have any other ideas?
     
    Last edited: Sep 17, 2008
  18. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    You might want to get interested, because the NetCell/RAID 3 is quite possibly your problem.
     
  19. trouphaz

    trouphaz New Member

    Joined:
    Sep 22, 2003
    Messages:
    2,666
    Likes Received:
    0
    yeah. sounds like he wanted to prove something by using RAID3 when all he proved is he is good at making bad decisions. as far as i'm concerned there is RAID1, RAID1+0, RAID5 and RAID6 (though i haven't really used this at all yet). RAID3 and RAID4 are retarded and shouldn't even be bothered with unless you have enough cache to offset the fact that you're going to easily overload that parity drive with a decent amount of writes.
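
    The parity-drive concern can be sketched as a toy model. It assumes one small write per stripe and one parity update per write, and the rotation formula is only one of several layouts real controllers use; the function names are hypothetical.

```python
def parity_disk(stripe, n_disks, rotating):
    """Index of the disk holding parity for a given stripe.

    rotating=False: dedicated parity disk (RAID3/RAID4 style), always the last disk.
    rotating=True:  parity moves by one disk per stripe (RAID5 style); the exact
                    rotation here is illustrative, not any vendor's layout.
    """
    if rotating:
        return (n_disks - 1 - stripe) % n_disks
    return n_disks - 1


def parity_writes(n_stripes, n_disks, rotating):
    """Count parity updates per disk, assuming one small write per stripe."""
    counts = [0] * n_disks
    for stripe in range(n_stripes):
        counts[parity_disk(stripe, n_disks, rotating)] += 1
    return counts


dedicated = parity_writes(300, 3, rotating=False)
rotated = parity_writes(300, 3, rotating=True)
print("dedicated parity:", dedicated)
print("rotating parity: ", rotated)
```

    In this model the dedicated layout sends every parity update to the same disk, while the rotating layout spreads them evenly across all three.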
     
  20. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    Well, he's reading. But doesn't it hit the parity drive to test reads too?
     
  21. trouphaz

    trouphaz New Member

    Joined:
    Sep 22, 2003
    Messages:
    2,666
    Likes Received:
    0
    hmm... i guess it does have to verify parity on reads as well.
     
  22. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    I would bet that... XOR'ing all the data one byte at a time would get slow if it isn't smart about how it writes parity data to disk.

    That's my guess.
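
    For reference, byte-by-byte XOR parity looks roughly like the sketch below. It assumes two data disks and one parity disk (a 3-disk array) and uses made-up contents; it says nothing about how the NetCell firmware actually lays out or computes parity.

```python
def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equal-length byte strings, one byte at a time."""
    return bytes(a ^ b for a, b in zip(left, right))


# Hypothetical contents of the two data disks.
disk_a = b"RAID"
disk_b = b"test"

# The parity disk stores the XOR of the data disks.
parity = xor_bytes(disk_a, disk_b)

# If one data disk is lost, XOR of the survivor and parity rebuilds it.
rebuilt_b = xor_bytes(disk_a, parity)
rebuilt_a = xor_bytes(disk_b, parity)
print(rebuilt_a, rebuilt_b)
```

    The same operation serves for computing parity on writes, checking it on reads, and rebuilding a missing disk.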
     
  23. 1999TL

    1999TL New Member

    Joined:
    Sep 22, 2006
    Messages:
    484
    Likes Received:
    0
    Location:
    Lubbock, TX
    you guys are missing the point. He can say he has raid 3 and you don't.
     
  24. deusexaethera

    deusexaethera OT Supporter

    Joined:
    Jan 27, 2005
    Messages:
    19,712
    Likes Received:
    0
    I didn't buy it to prove anything. I bought it because it's full-hardware RAID at a quarter the price of the next-higher-priced full-hardware RAID card, and because it doesn't require drivers to boot from. I was intrigued that it was a RAID3, but frankly, at the time I bought it, I had no idea what the difference was between RAID3 and RAID5.

    It may well be the RAID controller is on the outs, but I ran PCMark a few months ago when I installed my FLASH pagefile disk, and the results came out the same. (I wish I'd saved those on the ORB, I dunno why I didn't.) The only thing I can think of is a lot of data (dozens of GBs) has moved onto and off of the array since then, leaving the overall load roughly the same in the end, but who knows what it did to the file allocation table. Still...maybe that has an effect, I don't know.

    Anyway, since it has a 1B stripe size, there is no circumstance in which the controller is pulling data from fewer than all of the disks, or writing data to fewer than all of the disks. This means there is never a file too small to benefit from having multiple disks to be stored on. It might be a RAID3 according to the distribution of parity data, but the parity disk isn't the bottleneck because it's always reading from/writing to all the disks all the time anyway.
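
    The point about the 1B stripe size can be illustrated with a small sketch. It assumes the 3-disk array is organized as two data disks plus one parity disk, and that data is dealt out round-robin in stripe-sized chunks; the function names and the 100-byte sample file are hypothetical.

```python
def stripe(data: bytes, n_data_disks: int, stripe_size: int) -> list:
    """Deal data out round-robin across the data disks in stripe_size chunks."""
    disks = [bytearray() for _ in range(n_data_disks)]
    for offset in range(0, len(data), stripe_size):
        target = (offset // stripe_size) % n_data_disks
        disks[target] += data[offset:offset + stripe_size]
    return [bytes(d) for d in disks]


def data_disks_touched(data: bytes, n_data_disks: int, stripe_size: int) -> int:
    """Number of data disks that end up holding at least one byte."""
    return sum(1 for d in stripe(data, n_data_disks, stripe_size) if d)


small_file = bytes(range(100))  # a 100-byte "file"

byte_striped = data_disks_touched(small_file, 2, 1)           # 1B stripes
block_striped = data_disks_touched(small_file, 2, 64 * 1024)  # 64kB stripes
print(byte_striped, block_striped)
```

    With 1B stripes even a tiny file lands on both data disks, whereas with 64kB stripes the same file fits inside a single chunk on one disk.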
     
    Last edited: Sep 17, 2008
  25. trouphaz

    trouphaz New Member

    Joined:
    Sep 22, 2003
    Messages:
    2,666
    Likes Received:
    0
    who knows what your problem is at this point. you're doing weird things with your PC, so you're going to have fun untangling that mess.
     
