Advanced: VCS clustering using Solaris Zones

Discussion in 'OT Technology' started by trouphaz, Apr 28, 2008.

  1. trouphaz

    trouphaz New Member

    Joined:
    Sep 22, 2003
    Messages:
    2,666
    Likes Received:
    0
    So, I'm doing some research on VCS clustering using Solaris Zones/Containers. Symantec's recommendation is that you keep your zone local to each server in the cluster instead of putting it on shared storage, then let VCS start the appropriate zone and keep control of the IP in VCS's hands (instead of leaving it in the zone configuration). So, both servers in a 2-node cluster will have identical files, including full local copies of the zone. When you want to start zone 1, VCS will boot the zone and then start a cluster-controlled IP on that zone.

    What I'm trying to do, and have yet to find any info confirming or ruling out, is have application failover as well. I'd like to install my app on shared storage and have VCS control not only the starting and stopping of the zone, but also the failover of an application in that zone. This way, I can keep the application up while I patch or do other maintenance on the offline node and make sure that the zone is being patched properly as well.

    EDIT: I forgot this part. :)
    Does this make sense? We are trying to consolidate a number of different applications which each run on their own machine onto a cluster running Zones so they still appear to have their own environment.
     
  2. Mike99TA

    Mike99TA I don't have anything clever to put here right now

    Joined:
    Oct 3, 2001
    Messages:
    4,553
    Likes Received:
    0
    Location:
    Greenville, SC
    I'm not all that familiar with Solaris Zones as I've never had a need to manage Solaris machines before, but what you want to do is definitely within the standard realm of VCS and in fact we're doing some things very similar, in that we have VCS managing resources that are both shared on the SAN, as well as local on each machine. As long as you set your resources up correctly and link them correctly then VCS doesn't care if they're shared or not.
     
  3. trouphaz

    trouphaz New Member

    Zones are a weird sort of pseudo-virtualized environment somewhere between a normal OS and something like VMware. So, you'll have your global zone, which is the main OS. Then, you'll have a guest OS which runs on top of it. Now, VMware virtualizes the entire environment and the guest OS is running a full copy of its own OS including its own kernel. With Zones, you are sharing the kernel with the main global zone, but each local zone appears to be its own full system.

    So, let's say you have your OS with its traditional filesystems. You'll then have some directory where you house your zone. Let's say you create a directory called /zone1. When you create the zone, Solaris will copy a certain number of packages over and create subdirectories. Then, when you boot this zone, it'll appear as its own server, but will actually remap some of the standard directories (like /usr) into it, making it look like you have a full OS there.

    Anyway, in my example above, if you mount a filesystem on the /zone1/root/app directory from the global zone, it'll show up within the local zone as /app, but it no longer shows up in the global one at all. This is what I'm afraid will cause VCS confusion. It can mount and unmount as well as run commands like "fuser -c /zone1/root/app" to see what is running on that mount (the global zone can actually see processes and everything in the local zones, but not vice versa). But how will the VCS monitor work if it can't see the mount?
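
    A rough way to picture the path mapping described above, as a minimal Python sketch. The function name zone_view and the default zone root /zone1/root are only for illustration (just the /zone1 example comes from the post). It does nothing Solaris-specific; it only strips the zone root prefix from a global-zone path.

```python
from pathlib import PurePosixPath
from typing import Optional


def zone_view(global_path: str, zone_root: str = "/zone1/root") -> Optional[str]:
    """Return the path as seen from inside the zone, or None if it is not under the zone root."""
    path = PurePosixPath(global_path)
    root = PurePosixPath(zone_root)
    try:
        relative = path.relative_to(root)
    except ValueError:
        return None  # the path is not below the zone root
    return str(PurePosixPath("/") / relative)


print(zone_view("/zone1/root/app"))  # prints /app
print(zone_view("/export/home"))     # prints None
```

    A monitor running in the global zone would work with the long path, while anything running inside the zone would use the short one.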
     
  4. Mike99TA

    Mike99TA I don't have anything clever to put here right now

    Well, the monitor script can work any way you want it to. If you expect /somefilesystem/directory1 to be gone when it's mounted inside of a zone, you can tell your monitor script to look for that directory, and if it's not there you know the zone is running (obviously it's not that simple though).

    We use quite a lot of custom monitor scripts with our VCS implementations, the primary reason being that we do not want VCS to fail a service group over because an app may have failed. A lot of our clustered apps are controlled by users that will take them offline and bring them online for various reasons out of our control, so a lot of our service groups do something like the following:

    - when the servicegroup comes online, mount up some filesystems
    - create a .cluster_monitor_do_not_delete file with the unique kernel boot id in it when bringing the filesystems online
    - the monitor script(s) for the app(s) running as part of the servicegroup simply look for the file, then if it exists they compare the unique boot id in it to the current kernel boot id - if it matches, it means the filesystems are mounted properly and not failed in any weird way.
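
    The marker-file idea above could be sketched roughly like this in Python. This is not the actual script: the function names are made up, the marker is assumed to live at the top of the mounted filesystem, and how you obtain the kernel boot id is left to the caller since that depends on the OS.

```python
import os
import tempfile

MARKER_NAME = ".cluster_monitor_do_not_delete"


def write_marker(mount_point: str, boot_id: str) -> str:
    """Online step: record the current boot id on the freshly mounted filesystem."""
    marker = os.path.join(mount_point, MARKER_NAME)
    with open(marker, "w") as handle:
        handle.write(boot_id + "\n")
    return marker


def filesystems_ok(mount_point: str, current_boot_id: str) -> bool:
    """Monitor step: True only if the marker exists and matches the current boot id."""
    marker = os.path.join(mount_point, MARKER_NAME)
    try:
        with open(marker) as handle:
            return handle.read().strip() == current_boot_id
    except OSError:
        return False  # marker missing or unreadable


if __name__ == "__main__":
    # A temporary directory stands in for the mounted filesystem.
    with tempfile.TemporaryDirectory() as fake_mount:
        write_marker(fake_mount, "boot-1234")
        print(filesystems_ok(fake_mount, "boot-1234"))  # True
        print(filesystems_ok(fake_mount, "boot-9999"))  # False
```

    The point of comparing boot ids rather than just checking that the file exists is that a marker left over from before a reboot no longer counts as "online".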

    I don't think that scenario really applies to you since it sounds like you really want to actually monitor the zones and make sure they're running properly, but the fact that you can custom script your monitors in any way means (mostly) infinite possibilities so I'm sure you can do what you're looking for.
     
  5. trouphaz

    trouphaz New Member

    i guess i can modify the monitor to look inside the zone itself for the mount. i guess i never really considered modifying the monitor or creating my own. anyway, i created a service group for the zone today and i'll allocate some storage tomorrow for testing.
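
    One way a custom monitor could check for the mount, as a minimal Python sketch. It assumes you already have the mount table text from inside the zone (how you fetch it is not shown) and that it uses a whitespace-separated layout with the mount point as the second field, as in Solaris /etc/mnttab. The sample lines, including the device name, are made up.

```python
def is_mounted(mount_table: str, mount_point: str) -> bool:
    """Return True if mount_point appears as the second field of any line."""
    for line in mount_table.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False


# Made-up sample in "special mount_point fstype options time" order.
sample = (
    "/dev/vx/dsk/appdg/appvol\t/app\tvxfs\trw,suid\t1209400000\n"
    "swap\t/tmp\ttmpfs\txattr\t1209300000\n"
)

print(is_mounted(sample, "/app"))   # True
print(is_mounted(sample, "/data"))  # False
```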


    hopefully, i can get it working very well because my management wants to use Solaris LDOMs, which are cool in theory but kind of painful to consider for a production environment since they just came out. LDOMs (logical domains) are sort of similar to HP-UX vPars or VMware, but for specific Sun servers (they have to be running CoolThreads processors) which have virtualization hooks built into the hardware. it is cool, but they are sort of limited right now.
     
  6. Peyomp

    Peyomp New Member

    Joined:
    Jan 11, 2002
    Messages:
    14,017
    Likes Received:
    0
    I ran a zone on solaris once. But I can't help you. :(
     
  7. Mike99TA

    Mike99TA I don't have anything clever to put here right now

    We're doing some research and development with Xen virtualization on SLES. We're actually setting up server farms, but we're doing it without VCS. We're using a custom farm framework with a SAN backend that lets us move virtual machines between servers with minimal impact and allows automatic failover should a server die, as each server in the farm is aware of every other server and where the current VMs are running. It sounds like it might be easier to manage HA Zones without VCS, but I would try it with VCS first with a custom monitor.
     
  8. trouphaz

    trouphaz New Member

    man. one of these days i'm going to have to learn perl. i'm trying to see what exactly VCS is doing to start, stop and monitor these zones and it is all perl scripts. for some reason, the cluster can start the zone, though with some goofy error saying the resource was already started. then, the monitor starts reporting some goofy errors while it is up, though it properly recognizes that it is up and if i take it down manually the monitor recognizes that it was taken down outside of cluster control. the one thing that isn't working properly is the stop. it says it calls for a stop, waits for a while doing nothing that i can see and then does a clean where it forcibly takes it down. i want to see exactly what it is trying to do when it does the stop.

    manually booting and halting the zone works fine.
     
  9. crontab

    crontab (uid = 0)

    Joined:
    Nov 14, 2000
    Messages:
    23,439
    Likes Received:
    11
    I'm interested to see what comes out of this. We just use zones with no HA. I would like to deploy Oracle RAC within zones and wonder how the db and apps will stay up during a failover from zone to zone.
     
  10. Mike99TA

    Mike99TA I don't have anything clever to put here right now

    I am amazed you were able to make it 11 years as a Unix admin without learning perl! I think I learned it in my 2nd year or so...been writing it for 9 years now, extremely fluent, and have dozens of multi-thousand line scripts under my belt, and probably thousands of scripts <1000 lines. It's an extremely useful skill in the admin world and I almost always prefer writing perl to writing shell scripts unless it's < 10 lines.
     
  11. crontab

    crontab (uid = 0)

    I don't know perl, but would like to learn as well. I only know a few one-liners that I got from some gurus in the past. I managed to survive with shell/sed/awk.
     
  12. Mike99TA

    Mike99TA I don't have anything clever to put here right now

    IMO the easy way to learn perl is to just get the O'Reilly perl book and skim through it while writing some practice programs - that's how I did it. It's definitely possible to just use Google and perldoc but in this case I think it's more efficient to have the book.

    #perl on irc.freenode.net is helpful but only if you have a very specific question and in a lot of cases you have to deal with people telling you you're wrong, stupid, etc (like most open source irc channels).
     
  13. trouphaz

    trouphaz New Member

    It helped that I'm good friends with a few perl gurus. The problem is that I no longer work with them so my crutch isn't as easily accessible. But, all of my stuff is done with sed, awk and grep. Shell scripting is sort of brute force at times, but it works. For tricky stuff I outsource with beer as payment.
     
  14. trouphaz

    trouphaz New Member

    what are you using the zones for right now? what do you think so far? i'm not sold on them yet, but i'm feeling more confident in them than in LDOMs.
     
  15. crontab

    crontab (uid = 0)

    Hmm, for consolidation really. The apps/dbs that live within the zones do not have to be up five nines. But there's no need to buy three T2000s for dev/qa/prod, when it can all fit into one machine or one that is currently in use for something else.

    We have a few other zoned T2000s for dev/qa zones.

    We haven't tried LDOMs yet because our TAM says it's not ready.
     
  16. trouphaz

    trouphaz New Member

    your technical account manager from Sun said it isn't ready? if so, can you have your guy call my management to help drive that point home? :)
     
  17. crontab

    crontab (uid = 0)

    Yup. Why, is he trying to sell it to you guys really hard?

    We don't need to be bleeding edge or have the latest and greatest. We just need to be stable and solid; fortunately, our TAM understands that.
     
  18. trouphaz

    trouphaz New Member

    before i started, a few people here decided that LDOMs were the way to go. now that i've started, i'm talking to them, trying to convince them that LDOMs are a bad idea for production since production directly affects $$$ and LDOMs don't have any track record yet to prove they are any good. so far i have them accepting zones for production and LDOMs for development, but i'm still unsure if this decision will change.
     
  19. trouphaz

    trouphaz New Member

    Just an update. I've got my cluster config finished. I have the request in for storage, so once it is approved I'll carve up a lun and test mounting a filesystem on a running zone through VCS. The mount resource does have a container parameter, so hopefully it works.

    Just a quick recap: I have a zone that I created on one host, then copied over to a second. I created the zone without any network information, so by default it is not accessible over the network. Then with VCS I have it start the zone and apply an IP. The IP resource has a container parameter so it knows it is bringing up an IP in a local zone and not on the global zone itself.
     
  20. trouphaz

    trouphaz New Member

    Ok, latest update:
    - i assigned a 5 GB LUN shared between both hosts, created a disk group, created a 1 GB volume and a VxFS filesystem.
    - when creating the mount resource, i had to set it up to mount to /local/zones/zone1/root/app (/local/zones/zone1 is just the path to the zone). this makes it available as /app within the zone.
    - you do not set that container parameter for the mount resource.
    - i have disk group and zone starting together, then mount depending on both of them, then ip depending on the mount. all are enabled and critical.
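
    The dependency layout in the last bullet can be sketched as a small graph. This is only an illustration in Python (3.9+ for graphlib), not VCS configuration; the resource names are plain labels chosen here, not VCS resource type names.

```python
from graphlib import TopologicalSorter

# resource -> resources that must be online before it
depends_on = {
    "diskgroup": set(),
    "zone": set(),
    "mount": {"diskgroup", "zone"},
    "ip": {"mount"},
}


def online_order(deps):
    """Return one valid order in which the resources can be brought online."""
    return list(TopologicalSorter(deps).static_order())


order = online_order(depends_on)
print("online: ", order)
# Taking resources offline walks the same order backwards.
print("offline:", list(reversed(order)))
```

    The disk group and zone have no dependency on each other, so either may come first; the mount always follows both, and the ip always comes last.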

    my tests so far that have worked are:
    - bring everything up, manually halt zone from UNIX. it fails over properly to other side.
    - manually umount /app filesystem from global zone. VCS recognizes it is gone and fails over.

    what else would you recommend as tests?
     
  21. crontab

    crontab (uid = 0)

    What if you lose I/O to the LUN? Say the LUN wasn't multipathed and you lose a transceiver or glass.
     
  22. Mike99TA

    Mike99TA I don't have anything clever to put here right now

    Veritas handles that automatically. It's built into the Storage Foundation functionality with the disk group - if the system loses access to disk(s) in the disk group it will fault the servicegroup and fail it over.
     
  23. trouphaz

    trouphaz New Member

    actually, on these systems we're using PowerPath since the storage is EMC. it should be pretty transparent to VxVM and VCS.
     
  24. Mike99TA

    Mike99TA I don't have anything clever to put here right now

    I was speaking more along the lines of a total fibre failure on the server (not just a single path failure), i.e. the HBA driver dies or something of the sort - assuming you're using a Veritas Disk Group it will fail over to the other node.

    of course if you're multipathing and you only lose one path (whether you're using special multipath software or just VxDMP) you shouldn't have any issues at all.
     
  25. trouphaz

    trouphaz New Member

    oh, yeah. that's what crontab was asking anyway. i was hoping i could get it to deport the disk group while it was online, but as long as any volumes are up it won't work. since these machines are being used for some other work, i can't go down and yank the cables.
    i'm getting a pair of T5220s soon though, so i can test it there.
     
