DiskSuite uses metadb DB replicas to hold all the critical information needed to access your data stored on metadevices. These replicas are in many cases essential to maintaining your data integrity, and no discussion of DiskSuite would be complete without some discussion of the care and maintenance of these DB replicas.
DiskSuite requires space on the disk to store its metadb database replicas. Because this database contains the critical information needed to access the disks, it must be replicated as widely as possible. In general you should spread the replicas evenly over as many disks as are available, and where possible they should be evenly distributed over multiple controllers as well (although this is often not feasible). On a system with only 2 internal disks, the replicas will most likely be limited to those 2 disks, and should be divided equally between them (this way the system can stay up if either disk fails). We also have some systems with 4 internal drives, and in these cases we replicate the databases across all 4 drives, even if only two of them are to be mirrored.
The database replicas take up room on the disk which cannot be used for filesystems, etc. The typical partition scheme for a Glued Solaris box is as follows:
/usr/afs on AFS servers, maybe DB replicas
As can be seen, the standard glue setup uses most of the available slices. Slice 2 might be usable, but I would recommend against it, especially on a system disk. That leaves slices 6 and 7 free. Physics generally puts the DB replicas on one of these 2 slices. The replicas are not that large: 8192 blocks, or about 4 MB, on recent versions (Solaris 9), and much smaller (about 0.5 MB) on earlier versions (Solaris 7, 8), and we usually put 2-3 copies on each disk. (NOTE: it is important to spread the copies out over multiple disks, and to have the same number of replicas on each disk.) Since I dislike making slices smaller than 50 or so MB, we usually waste a fair amount of space anyway. The other slice may provide additional local space if the disk is big enough that I cannot justify expending the entire disk on system slices.
However, if you hope to make the system an AFS server (thus using slice 6), and possibly put data on slice 7, you have a problem, as there are no more partitions free for the DB replicas. Fortunately, there is a way around that, at least if you do the mirroring before making the system an AFS server. DiskSuite can share a slice between the DB replicas and a filesystem in some cases: create the replicas on the slice first, then build a metadevice over the remainder. If needed, the replica size can be controlled with the -l option to metadb.
Because it is unwise to have DiskSuite manage a vice partition on an AFS server, and since you would want the AFS server software of an AFS server mirrored also, the best bet is if you can mirror the system before the AFS server software is installed. Put the DB replicas on slice 6, mirror /var, swap, and the AFS cache as normal, then create an empty metadevice on slice 6, newfs it, and mount it on /usr/afs.
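Assuming hypothetical names (slice c0t0d0s6 shared between the replicas and /usr/afs, and d60 as an unused metadevice name), the shared-slice part of that procedure might be sketched as below. Because metadb, metainit, newfs, and mount-of-a-metadevice are Solaris-specific, the sketch only prints each step for review; on a real host you would run the commands directly:

```shell
#!/bin/sh
# Hypothetical names: adjust the slice and the d60 metadevice for your host.
SLICE=c0t0d0s6

run() { echo "$@"; }    # print each step; on a Solaris host, execute directly instead

run metadb -a -f -c 3 /dev/dsk/$SLICE   # replicas claim the start of the slice
run metainit d60 1 1 $SLICE             # simple concat over the remainder of the slice
run newfs /dev/md/rdsk/d60
run mount /dev/md/dsk/d60 /usr/afs
```

The remaining slices (/var, swap, the AFS cache, etc.) are mirrored as usual before the AFS server software is installed.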
Some example configurations from Physics:
/usr/afs on slice 6, and three DB replicas on slice 7. Slices 0, 1, 3, 4, 5, and 6 are all mirrored.
Slice 6 holds /usr/afs, and slice 7 contains the extra space. Slices 0, 1, 3, 4, 5, and 6 are mirrored. Slice 7 may or may not be mirrored (definitely not if used as a vice partition).
Slice 6 holds /usr/afs. Slices 0, 1, 3, 4, 5, and 6 on the system disks are mirrored. Slices 0-6 on the other two disks are available, and may or may not be mirrored (definitely not if used as a vice partition).
Regardless of whether you want to do logging, mirroring, striping, or RAID, you need to create the metadb DB replicas for Disksuite. Because this step is so universal, it is being covered in its own section.
Before creating the DB replicas, you should have partitioned the disks, with a slice set aside for the replicas; you can use the format command to do this. If you are mirroring the disks, you want them to have the same partition structure anyway, so once the first disk is set up, you can use the command
prtvtoc /dev/rdsk/DISK1 | fmthard -s - /dev/rdsk/DISK2
We are now ready to create the state meta-databases. First, make sure no one configured DiskSuite without your knowledge by checking for the existence of DB replicas with the metadb command. Solaris 2.7 users may have to give a full path to the command, /usr/opt/SUNWmd/sbin/metadb. On Solaris 8 and 9, it is in /usr/sbin, which should be in your path. This should return an error complaining that "there are no existing databases". It might also just return nothing (usually indicating that DB replicas were set up once and then all were deleted).
If you get a list of replicas, STOP. Someone set up, or tried to set up, DiskSuite before you; figure out what the status is before proceeding further. Using the command below to try to create another initial database set will hopefully yield an error, but if not it could be disastrous, wiping out the previous DB and making the previously mirrored, striped, etc. disks inaccessible.
For a two-disk system, Sun advises a minimum of 2 replicas per disk; Physics uses 3. To create the initial replicas, issue the command (as root):
metadb -a -f -c 3 slice
Here slice is something like /dev/dsk/c0t0d0s7 to put it on slice 7 of the 1st disk. The -c 3 in the above command instructs it to put three copies of the DB there. The -a says we are adding replicas, and the -f forces the addition. NOTE: the -f option should only be used for the initial DB replica, when it is REQUIRED to avoid errors due to the lack of any previously existing replicas.
NOTE: if you are replacing a replica set on a partition that has a filesystem on it, be aware of the change in the default replica size between Solaris 7/8 and Solaris 9. You may need to use the -l option on metadb to limit the size of the new replicas so as not to overwrite the beginning of the filesystem, or do some nasty recreation of the filesystem at a smaller size.
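For instance, to recreate 3 replicas at the older Solaris 7/8 size on a hypothetical slice, the command might look like the sketch below. The 1034-block figure (about 0.5 MB) is an assumption about the older default size; check the block count reported by plain metadb for your existing replicas before relying on it. The command is printed rather than executed, since metadb exists only on Solaris:

```shell
#!/bin/sh
# Hypothetical slice; 1034 blocks (~0.5 MB) is an assumed older default size.
# Verify it against the block count shown by metadb for the existing replicas.
SLICE=/dev/dsk/c0t0d0s7

cmd="metadb -a -c 3 -l 1034 $SLICE"   # -l limits each new replica to 1034 blocks
echo "$cmd"                           # printed for review; run directly on Solaris
```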
You can check that the databases were created successfully by issuing the metadb command without arguments, or with just the -i argument to get a legend for the flags. You should now see 3 (or whatever value you gave to the -c argument) DB replicas in successive blocks on the slice you specified. At this point, only the a (active) and u (up-to-date) flags should be set.
Now add the replicas on the second (and any other) drives. This is done with a command like:
metadb -a -c 3 /dev/dsk/c0t1d0s7
If you make a mistake, you can use the -d option to delete all replicas on the named partition, and then re-add the correct number.
You can use the plain metadb command (or give it the -i option for the flags legend) to verify the databases are functioning properly. This should be used right after creation to ensure they were created successfully, and is also useful later to verify things are OK.
You should again see a line for each replica on each disk, along with some flags indicating the status of each replica. In general, lower case flags are good, upper case flags are bad. The following flags seem to be set on a functioning system (flags should appear for every replica unless otherwise stated):
a: the replica is active. This should always be set.
m: flagging the master replica (only one replica should have this set, usually the first)
p: the replica is patched into the kernel. This should get set after the first reboot (why? what does it mean?)
l: the replica was read successfully. This should get set after the first reboot?
u: the replica is up-to-date. This should always be set.
o: the replica was active prior to the last database change. This should get set after the first reboot.
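Since lower-case flags are good and upper-case flags are bad, one quick scripted health check is to scan metadb output for capital letters on the data lines. The sample output below is illustrative only, not captured from a real host; the W flag (device write errors) is used here to mark a failed replica. On a real system you would pipe metadb itself into the filter:

```shell
#!/bin/sh
# Illustrative stand-in for `metadb` output (not from a real host).
sample='        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c0t0d0s7
     a    p  luo        8208            8192            /dev/dsk/c0t0d0s7
      W   p  l          16              8192            /dev/dsk/c0t1d0s7'

# Device names (c0t1d0s7, etc.) contain no capitals, so any upper-case letter
# on a line after the header indicates a replica in an error state.
echo "$sample" | awk 'NR > 1 && /[A-Z]/ { print "BAD:", $NF }'
```

On a real host the equivalent check would be `metadb | awk 'NR > 1 && /[A-Z]/ { print "BAD:", $NF }'`.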