Mirroring the Boot Disk with DiskSuite on Glued Solaris Boxes


Notes on Mirroring the boot disk with DiskSuite on Glued Solaris Systems

DiskSuite is a standard Sun software-based RAID package available on Glued Solaris boxes. One use of it is to mirror the internal system (boot) disks on systems with dual internal disks. This page details how to do this on Glued Solaris systems, based on experiences with mirroring a pair of internal system disks on Sun Enterprise 220R, SunFire 280R, and SunFire V440 systems, running Solaris 2.7 through 2.9.

The instructions assume we are going to mirror the system/boot disk (e.g. filesystems root (/), /usr, /var, /usr/vice/cache, and swap). It is fairly straightforward to adapt these instructions to mirror other filesystems (either separately or in conjunction with the above), and to deal with systems with more than 2 disks.

The instructions assume that we are enabling mirroring on a newly installed, non-production machine. These restrictions are not strictly required, but enabling mirroring of the system disk on a system already in production obviously carries some risk. The system WILL need to be rebooted at least once when mirroring the system disk; if a non-system disk is being mirrored you can probably get away with just unmounting and remounting the partitions being mirrored at the appropriate times. NOTE: all filesystems being mirrored must be mounted using the single-ply (one-submirror) mirror metadevice before attaching the second submirror to the mirror metadevice. Otherwise, new data written to the filesystem will only be written to one submirror, and DiskSuite will get very upset upon the next attempt to reboot (as the two submirrors are listed as being in sync but are not); if you do this with root, you will probably not even be able to get to single-user mode.
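
The check called for in the NOTE above can be done mechanically. The following is a minimal sketch, not part of the original procedure: the function name not_on_metadevice is made up for illustration, and it assumes its input is in /etc/vfstab format (device in the first column, mount point in the third), which is the format that mount -p prints on Solaris. The sample lines fed to it here are hypothetical. Swap does not appear in mount output, so check it separately with swap -l.

```shell
# Report each listed mount point that is NOT mounted from a DiskSuite
# metadevice (/dev/md/...). Reads vfstab-format lines on stdin; the
# arguments are the mount points to check.
not_on_metadevice() {
    while read dev rawdev mnt rest; do
        # Skip comments and blank lines.
        case "$dev" in
            '#'*|'') continue ;;
        esac
        for want in "$@"; do
            if [ "$mnt" = "$want" ]; then
                case "$dev" in
                    /dev/md/*) ;;                           # on a metadevice: fine
                    *) echo "$mnt is still on $dev" ;;     # physical device: stop
                esac
            fi
        done
    done
}

# Demonstration with hypothetical sample input: / is on a mirror
# metadevice, /var is still on the physical slice.
not_on_metadevice / /var <<'EOF'
/dev/md/dsk/d0 /dev/md/rdsk/d0 / ufs 1 no -
/dev/dsk/c0t0d0s5 /dev/rdsk/c0t0d0s5 /var ufs 1 no -
EOF
```

On a live system the equivalent check would be something like mount -p | not_on_metadevice / /usr /var /usr/vice/cache; no output means none of the listed filesystems is still mounted from a physical device.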

  1. Do a "normal" install of Glue onto the first of the two internal disks. The only thing special compared to any other Glue install is that you need to have an idea of where the metadb DB replicas are to go and make sure the partition table has room for them. See this section for more information about locating the DB replicas.
  2. After Glue is installed, the initial rdist done, and the system is up and healthy, create the partition table on the second disk. I am not sure if this is strictly required, but if not it is strongly advised that all slices to be mirrored have the same slice number and be identical in the partition table (same starting and ending cylinder number, tags, flags, etc.). Indeed, since typically you will be mirroring all defined slices on the two disks, it is usually easiest to have identical partition tables, which you can do by copying the table between DISK1 and DISK2:
    prtvtoc /dev/rdsk/DISK1 | fmthard -s - /dev/rdsk/DISK2
  3. Perform any preliminary steps required. In particular, for Solaris 2.7, a number of devices and files need to be created. You may also wish to review some of the references.
  4. Decide on the location of the DB replicas (see here for help), and install the DB replicas. E.g., if you are installing to slice 7 of the two disks c0t0d0 and c0t1d0, you would use
    metadb -f -c3 -a /dev/dsk/c0t0d0s7
    metadb -c3 -a /dev/dsk/c0t1d0s7
  5. Decide on a metadevice naming scheme.
  6. Create the metadevices for the two submirrors and for a single-ply mirror device for each partition to be mirrored. I find this easiest to do by changing the md.tab file (found in /etc/opt/SUNWmd on Solaris 2.7 systems, and /etc/lvm on Solaris 8 and 9 systems); this has the advantage of allowing comments and a more lasting record of what was done (especially if you save a copy in the glue config tree).
  7. Issue the mount command and make sure all filesystems to be mirrored are listed as mounted under the metadisk mirror device, not the physical device. Do not proceed any further unless this is the case: attaching the second submirror while the physical device for the first submirror is mounted will result in metadisk syncing the disks, but any subsequent writes to the filesystem go to one disk only, thereby breaking the synchronization. What is worse, metadisk thinks they are synchronized, and will fail hard. You might not even be able to get into single-user mode on the next reboot. Make sure all filesystems to be mirrored are mounted with the metadisk mirror device before attaching the other submirror. This generally means a reboot after editing vfstab as indicated in the previous step.
  8. At this point, the filesystems to be mirrored are using the mirrored metadevices, but we are not really mirroring yet because the mirrors all consist of only a single submirror. We can now attach the second submirror to each of the mirrors with a command like:
    metattach MIRROR SUBMIRROR2
    This causes SUBMIRROR2 to be attached to MIRROR, and will cause the newly attached submirror to be synchronized with the previous submirror, copying information from the old submirror to the new one. For the /var filesystem in our example, this would be metattach d6 d61. Repeat this for all filesystems to mirror. Because the synchronization can take a while and keep the disks quite busy, you may want to allow each slice to finish synchronizing before starting the next on a production system (see the metastat command below). On non-production systems I usually let them all sync in parallel.
  9. You can run the metastat command to see how things are going. All the mirrors you recently attached should be syncing up. This can take some time. The new submirrors should now show up as "Submirrors" and not as "Concat/Stripes". States for the newly attached submirrors will be "Resyncing" but should eventually change to "Okay" when synchronized.
  10. If the swap device was mirrored, you will need to change the crash dump device to the swap metadevice. This can be done with the command
    dumpadm -d `swap -l | tail -1 | awk '{print $1}'`
    It should report the new dump device as the swap mirror metadevice, e.g. /dev/md/dsk/d2. You can verify this at a later point by issuing the command dumpadm without any arguments. Failure to do this may cause serious problems if a crash dump ever gets written, and may make the system unbootable.
  11. If you are mirroring the boot disk (e.g. root (/)), you need to make the mirror disk bootable. On SPARC systems, this can be done with the command
    installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/ROOT2
    where ROOT2 is the root slice of the mirrored disk, e.g. c0t1d0s0.
  12. If you are mirroring the boot disk you will probably want to create an alias for the alternate boot disk. OpenBoot does not know about the software mirrors, and will only boot from physical disks. Normally it boots from the first disk, but it is helpful to define an alias for the mirrored disk so you can easily boot off of that disk if the primary disk fails. This can be done either from the shell or from OpenBoot (as was mentioned when we suggested that you might want to halt instead of reboot after modifying the vfstab file). If you did not do it then, you can do it now, as follows:
    1. Determine the physical device path of the mirror disk by examining the symbolic link for the device under /dev/dsk and extracting everything following the /devices part. E.g., if the mirrored root disk is c1t1d0s0, issue the command ls -l /dev/dsk/c1t1d0s0. This should return something like
      benfranklin:~# ls -l /dev/dsk/c1t1d0s0
      lrwxrwxrwx 1 root root 70 May 4 2004 /dev/dsk/c1t1d0s0 -> ../../devices/pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8a6403,0:a
      so what you want is /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8a6403,0:a.
    2. If you are doing this from a system that is up in Unix, first check for any previously existing device aliases with the command eeprom nvramrc. If it returns something like "data not available", you can just go ahead. Otherwise you need to determine whether the previous definitions should be kept. If you want to wipe them out, just proceed as if there were no previous definitions; otherwise you need to cut and paste the previous definitions into the next command.
    3. Create a device alias with an easily remembered name (e.g. "mirror") for this disk by issuing the following command from the Unix prompt
      eeprom "nvramrc=OLDDEFS devalias mirror PHYSPATH"
      where OLDDEFS are any old definitions in the NVRAM that you wish to keep, and PHYSPATH is the physical path to the secondary boot disk. For the above example, assuming no previous definitions exist (or none that you want to keep), you would have:
      eeprom "nvramrc=devalias mirror /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8a6403,0:a"
      You can, of course, choose a name other than "mirror" for the alias if desired. Alternatively, you can do this with the corresponding command at the OpenBoot prompt, e.g.
      nvalias mirror /pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8a6403,0:a
      (Actually, you can save some typing by running the show-disks command, and selecting the device you want, and then typing ^Y (control-Y) instead of the long device name in the nvalias command).
    4. Make sure the NVRAM aliases are read at boot up. From Unix, use the command
      eeprom "use-nvramrc?=true"
      or from the OpenBoot prompt the command
      setenv use-nvramrc? true
    5. You can verify the settings with the commands
      eeprom "nvramrc"
      from Unix, or
      devalias
      from the OpenBoot prompt.
  13. If you are mirroring the boot disk you should consider configuring the system to override the standard behavior and boot anyway even if only half of the DB replicas are available. I believe the default behavior (to only allow booting if more than half of the DB replicas are present) is there to prevent some pathological cases wherein one removes one of a mirrored pair of drives, boots the system, makes some changes, halts the system, puts back the removed drive and removes the other drive, boots the system, and makes some other changes to the filesystem. When the system is then booted with both drives, it will have 2 disks with different contents, each claiming to hold the "true" data. It should be safe to disable that safeguard as long as you don't engage in the weird behavior that leads to these pathological cases. You can disable it by adding the line
    set md:mirrored_root_flag=1
    in the file /etc/system. If you are primarily interested in a system that stays up as much as possible, and will reboot successfully if it goes down after a disk failure, you want this option set.
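
Step 12.1 above (turning the /dev/dsk symbolic link into an OpenBoot physical path) can be scripted. A minimal sketch follows, assuming the link target has the ../../devices/... form shown in the example; the function name devices_to_physpath is hypothetical, and the script only prints the eeprom command rather than running it, so that any existing nvramrc contents can be reviewed first.

```shell
# Convert a /dev/dsk symlink target into an OpenBoot physical path by
# dropping everything up to and including the "devices" component.
devices_to_physpath() {
    echo "$1" | sed 's|^.*/devices/|/|'
}

# Link target copied from the ls -l example in step 12.1.
target='../../devices/pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w21000004cf8a6403,0:a'
physpath=`devices_to_physpath "$target"`

# Print (do not run) the command that would define the "mirror" alias.
echo "eeprom \"nvramrc=devalias mirror $physpath\""
```

On a live system the link target could be obtained with something like ls -l /dev/dsk/ROOT2 | awk '{print $NF}', where ROOT2 is the root slice of the mirror disk as in step 11.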

