Wednesday, June 19, 2013

Veritas Volume Manager Notes

Veritas has long since been purchased by Symantec, but its products continue to be sold under the Veritas name. Over time, we can expect that some of the products will have name changes to reflect the new ownership.

Veritas produces volume and file system software that allows for extremely flexible and straightforward management of a system's disk storage resources. Now that ZFS is providing much of this same functionality from inside the OS, it will be interesting to see how well Veritas is able to hold on to its installed base.

In Veritas Volume Manager (VxVM) terminology, physical disks are assigned a diskname and imported into collections known as disk groups. Physical disks are divided into a potentially large number of arbitrarily sized, contiguous chunks of disk space known as subdisks. These subdisks are combined into volumes, which are presented to the operating system in the same way as a slice of a physical disk is.

Volumes can be striped, mirrored or RAID-5'ed. Mirrored volumes are made up of equally-sized collections of subdisks known as plexes. Each plex is a mirror copy of the data in the volume. The Veritas File System (VxFS) is an extent-based file system with advanced logging, snapshotting, and performance features.

VxVM provides dynamic multipathing (DMP) support, which means that it takes care of path redundancy where it is available. If new paths or disk devices are added, one of the steps to be taken is to run vxdctl enable to scan the devices, update the VxVM device list, and update the DMP database. In cases where we need to override DMP support (usually in favor of an alternate multipathing software like EMC Powerpath), we can run vxddladm addforeign.
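A minimal sketch of the rescan step described above, using hypothetical conditions (the wrapper prints each command instead of executing it unless RUN is set, since these utilities exist only on a system with VxVM installed):

```shell
#!/bin/sh
# Sketch only: rescan devices after new disks or paths are added, then
# inspect what DMP can see. With RUN unset, commands are printed
# rather than executed; set RUN=1 on a real VxVM system.
run() { if [ -n "$RUN" ]; then "$@"; else echo "$@"; fi; }

run vxdctl enable         # scan devices, update the VxVM device list, update DMP
run vxdmpadm listctlr all # show the controllers DMP has discovered
```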

Here are some procedures to carry out several common VxVM operations. VxVM has a Java-based GUI interface as well, but I always find it easiest to use the command line.

Standard VxVM Operations

  • Create a volume (length specified in sectors, KB, MB, or GB):
    vxassist -g dg-name make vol-name length[s|k|m|g]
  • Create a striped volume (add layout options to the same command):
    vxassist -g dg-name make vol-name length layout=stripe diskname1 diskname2 ...
  • Remove a volume (after unmounting it and removing it from vfstab):
    vxvol stop vol-name
    then vxassist -g dg-name remove volume vol-name
    or vxedit -rf rm vol-name
  • Create a VxFS file system:
    mkfs -F vxfs -o largefiles /dev/vx/rdsk/dg-name/vol-name
  • Snapshot a VxFS file system to an empty volume:
    mount -F vxfs -o snapof=orig-vol empty-vol mount-point
  • Display disk group free space:
    vxdg -g dg-name free
  • Display the maximum size volume that can be created:
    vxassist -g dg-name maxsize [attributes]
  • List physical disks:
    vxdisk list
  • Print the VxVM configuration:
    vxprint -ht
  • Add a disk to VxVM:
    vxdiskadm (follow the menu prompts)
    or vxdiskadd disk-name
  • Bring newly attached disks under VxVM control (it may be necessary to label the disk with format or fmthard before running vxdiskconfig):
    drvconfig; disks
    vxdiskconfig
    vxdctl enable
  • Scan devices, update the VxVM device list, and reconfigure DMP:
    vxdctl enable
  • Scan devices in the OS device tree and initiate dynamic reconfiguration of multipathed disks:
    vxdisk scandisks
  • Reset a disabled vxconfigd daemon:
    vxconfigd -kr reset
  • Manage hot spares:
    vxdiskadm (follow the menu options and prompts)
    or vxedit set spare=on|off vxvm-disk-name
  • Rename disks:
    vxedit rename old-disk-name new-disk-name
  • Rename subdisks:
    vxsd mv old-subdisk-name new-subdisk-name
  • Monitor volume performance:
    vxstat
  • Resize a volume (but not its file system):
    vxassist growto|growby|shrinkto|shrinkby volume-name length[s|m|k|g]
  • Resize a volume, including its file system:
    vxresize -F vxfs volume-name new-size[s|m|k|g]
  • Change a volume's layout:
    vxassist relayout volume-name layout=layout
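Several of the operations above usually occur together. The following sketch strings them into one sequence; the disk group "datadg", volume "vol01", and mount point /data01 are hypothetical names, and the wrapper prints each command instead of executing it unless RUN is set:

```shell
#!/bin/sh
# Sketch only: create a 2 GB mirrored volume, put a VxFS file system
# on it, and mount it. All names are hypothetical; with RUN unset the
# commands are printed rather than executed. Set RUN=1 on a system
# that actually has VxVM and VxFS installed.
run() { if [ -n "$RUN" ]; then "$@"; else echo "$@"; fi; }

run vxassist -g datadg make vol01 2g layout=mirror
run mkfs -F vxfs -o largefiles /dev/vx/rdsk/datadg/vol01
run mkdir -p /data01
run mount -F vxfs /dev/vx/dsk/datadg/vol01 /data01
```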

The progress of many VxVM tasks can be tracked by specifying a task tag at the time the command is run: utility -t tasktag. If a task tag is set, we can use vxtask to list, monitor, pause, resume, or abort the task labeled with that tag.
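For example (the disk group "datadg", volume "vol01", and tag "grow-vol01" are hypothetical; the wrapper prints each command instead of executing it unless RUN is set):

```shell
#!/bin/sh
# Sketch only: tag a long-running grow operation and monitor it from
# another shell. All names are hypothetical; with RUN unset the
# commands are printed rather than executed.
run() { if [ -n "$RUN" ]; then "$@"; else echo "$@"; fi; }

run vxassist -g datadg -t grow-vol01 growby vol01 10g  # launch with a task tag
run vxtask list                                        # list tagged tasks and progress
run vxtask monitor grow-vol01                          # follow this task until done
run vxtask pause grow-vol01                            # or pause/resume it
run vxtask resume grow-vol01
```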

Physical disks which are added to VxVM control can either be initialized (made into a native VxVM disk) or encapsulated (disk slice/partition structure is preserved). In general, disks should only be encapsulated if there is data on the slices that needs to be preserved, or if it is the boot disk. (Boot disks must be encapsulated.) Even if there is data currently on a non-boot disk, it is best to back up the data, initialize the disk, create the file systems, and restore the data.

When a disk is initialized, the VxVM-specific information is placed in a reserved location on the disk known as a private region. The public region is the portion of the disk where the data will reside.

VxVM disks can be added as one of several different categories of disks:

  • sliced: Public and private regions are on separate physical partitions. (Usually s3 is the private region and s4 is the public region, but encapsulated boot disks are the reverse.)
  • simple: Public and private regions are on the same disk area.
  • cdsdisk: (Cross-Platform Data Sharing) This is the default, and allows disks to be shared across OS platforms. This type is not suitable for boot, swap or root disks.
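The format is chosen when the disk is initialized. A minimal sketch, assuming hypothetical device names c1t2d0 and c1t3d0 and a disk group "datadg" (the wrapper prints each command instead of executing it unless RUN is set):

```shell
#!/bin/sh
# Sketch only: initialize disks with an explicit format and add one to
# a disk group. Device and group names are hypothetical; with RUN
# unset the commands are printed rather than executed.
run() { if [ -n "$RUN" ]; then "$@"; else echo "$@"; fi; }

run /etc/vx/bin/vxdisksetup -i c1t2d0                # default format: cdsdisk
run /etc/vx/bin/vxdisksetup -i c1t3d0 format=sliced  # e.g. for a future boot mirror
run vxdg -g datadg adddisk disk01=c1t2d0             # bring it into a disk group
```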

If there is a VxFS license for the system, as many file systems as possible should be created as VxFS file systems to take advantage of VxFS's logging, performance and reliability features.

At the time of this writing, ZFS is not an appropriate file system for use on top of VxVM volumes. Sun warns that running ZFS on VxVM volumes can cause severe performance penalties, and that it is possible that ZFS mirrors and RAID sets would be laid out in a way that compromises reliability.

VxVM Maintenance

The first step in any VxVM maintenance session is to run vxprint -ht to check the state of the devices and configurations for all VxVM objects. (A specific volume can be specified with vxprint -ht volume-name.) This section includes a list of procedures for dealing with some of the most common problems. (Depending on the naming scheme of a VxVM installation, many of the below commands may require a -g dg-name option to specify the disk group.)

  • Volumes which are not starting up properly will be listed as DISABLED or DETACHED. A volume recovery can be attempted with the vxrecover -s volume-name command.
  • If all plexes of a mirror volume are listed as STALE, place the volume in maintenance mode, view the plexes and decide which plex to use for the recovery:
    vxvol maint volume-name (The volume state will be DETACHED.)
    vxprint -ht volume-name
    vxinfo volume-name (Display additional information about unstartable plexes.)
    vxmend off plex-name (Offline bad plexes.)
    vxmend on plex-name (Online a plex as STALE rather than DISABLED.)
    vxvol start volume-name (Revive stale plexes.)
    vxplex att volume-name plex-name (Recover a stale plex.)
  • If, after the above procedure, the volume still does not start, we can force a plex to a “clean” state. (If the plex is in a RECOVER state and the volume will not start, add the -f option to the vxvol start command.)
    vxmend fix clean plex-name
    vxvol start volume-name
    vxplex att volume-name plex-name
  • If a subdisk's status is listed as NDEV even though the disk shows as available in vxdisk list, the problem can sometimes be resolved by running
    vxdg deport dgname; vxdg import dgname
    to re-initialize the disk group.
  • To remove a disk:
    Copy the data elsewhere if possible.
    Unmount file systems from the disk or unmirror plexes that use the disk.
    vxvol stop volume-name (Stop volumes on the disk.)
    vxdg -g dg-name rmdisk disk-name (Remove disk from its disk group.)
    vxdisk offline disk-name (Offline the disk.)
    vxdiskunsetup c#t#d# (Remove the disk from VxVM control.)
  • To replace a failed disk other than the boot disk:
    In vxdiskadm, choose option 4: Remove a disk for replacement. When prompted for a disk to use as the replacement, choose “none” (the actual replacement disk is supplied later, in option 5).
    Physically remove and replace the disk. (A reboot may be necessary if the disk is not hot-swappable.) In the case of a fibre channel disk, it may be necessary to remove the /dev/dsk and /dev/rdsk links and rebuild them with
    drvconfig; disks
    or a reconfiguration reboot.
    In vxdiskadm, choose option 5: Replace a failed or removed disk. Follow the prompts and replace the disk with the appropriate disk.
  • To replace a failed boot disk:
    Use the eeprom command at the root prompt or the printenv command at the ok> prompt to make sure that the nvramrc devalias entries and the boot-device parameter are set to allow a boot from the mirror of the boot disk. If the boot paths are not set up properly for both mirrors of the boot disk, it may be necessary to move the mirror disk physically to the boot disk's location. Alternatively, the devalias command at the ok> prompt can set the mirror disk path correctly; then use nvstore to write the change to the nvram. (It is sometimes necessary to run nvunalias aliasname to remove an alias from the nvramrc, then
    nvalias aliasname devicepath
    to set the new alias, then
    nvstore
    to write the changes to nvram.)
    In short, set up the system so that it will boot from the boot disk's mirror.
    Repeat the steps above to replace the failed disk.
  • Clearing a "Failing" Flag from a Disk:
    First make sure that there really is not a hardware problem, or that the problem has been resolved. Then,
    vxedit set failing=off disk-name
  • Clearing an IOFAIL state from a Plex:
    First make sure that the hardware problem with the plex has been resolved. Then,
    vxmend -g dgname -o force off plexname
    vxmend -g dgname on plexname
    vxmend -g dgname fix clean plexname
    vxrecover -s volname
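The disk-removal steps above can be sketched as one sequence. The disk group "datadg", disk "disk02", volume "vol02", and device c2t1d0 are hypothetical names, and the wrapper prints each command instead of executing it unless RUN is set:

```shell
#!/bin/sh
# Sketch only: remove disk "disk02" (device c2t1d0) from disk group
# "datadg" after its data has been copied elsewhere and its file
# systems unmounted. All names are hypothetical; with RUN unset the
# commands are printed rather than executed.
run() { if [ -n "$RUN" ]; then "$@"; else echo "$@"; fi; }

run vxvol -g datadg stop vol02        # stop volumes on the disk
run vxdg -g datadg rmdisk disk02      # remove the disk from its disk group
run vxdisk offline c2t1d0             # offline the disk
run /etc/vx/bin/vxdiskunsetup c2t1d0  # remove the disk from VxVM control
```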

VxVM Resetting Plex State

soltest/etc/vx > vxprint -ht vol53
Disk group: testdg
V NAME RVG/VSET/CO KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE
SC NAME PLEX CACHE DISKOFFS LENGTH [COL/]OFF DEVICE MODE
DC NAME PARENTVOL LOGVOL
SP NAME SNAPVOL DCO
EX NAME ASSOC VC PERMS MODE STATE
SR NAME KSTATE
v vol53 - DISABLED ACTIVE 20971520 SELECT - fsgen
pl vol53-01 vol53 DISABLED IOFAIL 20971520 CONCAT - RW
sd disk141-21 vol53-01 disk141 423624704 20971520 0 EMC0_2 ENA
soltest/etc/vx > vxmend -g testdg -o force off vol53-01
soltest/etc/vx > vxprint -ht vol53
Disk group: testdg
V NAME RVG/VSET/CO KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
v vol53 - DISABLED ACTIVE 20971520 SELECT - fsgen
pl vol53-01 vol53 DISABLED OFFLINE 20971520 CONCAT - RW
sd disk141-21 vol53-01 disk141 423624704 20971520 0 EMC0_2 ENA
soltest/etc/vx > vxmend -g testdg on vol53-01
soltest/etc/vx > vxprint -ht vol53
Disk group: testdg
V NAME RVG/VSET/CO KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
v vol53 - DISABLED ACTIVE 20971520 SELECT - fsgen
pl vol53-01 vol53 DISABLED STALE 20971520 CONCAT - RW
sd disk141-21 vol53-01 disk141 423624704 20971520 0 EMC0_2 ENA
soltest/etc/vx > vxmend -g testdg fix clean vol53-01
soltest/etc/vx > !vxprint
vxprint -ht vol53
Disk group: testdg
V NAME RVG/VSET/CO KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
v vol53 - DISABLED ACTIVE 20971520 SELECT - fsgen
pl vol53-01 vol53 DISABLED CLEAN 20971520 CONCAT - RW
sd disk141-21 vol53-01 disk141 423624704 20971520 0 EMC0_2 ENA
soltest/etc/vx > vxrecover -s vol53
soltest/etc/vx > !vxprint
vxprint -ht vol53
Disk group: testdg
V NAME RVG/VSET/CO KSTATE STATE LENGTH READPOL PREFPLEX UTYPE
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
v vol53 - ENABLED ACTIVE 20971520 SELECT - fsgen
pl vol53-01 vol53 ENABLED ACTIVE 20971520 CONCAT - RW
sd disk141-21 vol53-01 disk141 423624704 20971520 0 EMC0_2 ENA

VxVM Mirroring

Most volume manager availability configuration is centered around mirroring. While RAID-5 is a possible option, it is infrequently used due to the parity calculation overhead and the relatively low cost of hardware-based RAID-5 devices.

In particular, the boot device must be mirrored; it cannot be part of a RAID-5 configuration. To mirror the boot disk:

  • eeprom use-nvramrc?=true
    Before mirroring the boot disk, set use-nvramrc? to true in the EEPROM settings. If you forget, you will have to go in and manually set up the boot path for your boot mirror disk. (See “To replace a failed boot disk” in the “VxVM Maintenance” section for the procedure.) It is much easier if you set the parameter properly before mirroring the disk!
  • The boot disk must be encapsulated, preferably in the bootdg disk group. (The bootdg disk group membership used to be required for the boot disk. It is still a standard, and there is no real reason to violate it.)
  • If possible, the boot mirror should be cylinder-aligned with the boot disk. (This means that the partition layout should be the same as that for the boot disk.) It is preferred that 1-2MB of unpartitioned space be left at either the very beginning or the very end of the cylinder list for the VxVM private region. Ideally, slices 3 and 4 should be left unconfigured for VxVM's use as its public and private region. (If the cylinders are aligned, it will make OS and VxVM upgrades easier in the future.)
  • (Before bringing the boot mirror into the bootdg disk group, I usually run an installboot command on that disk to install the boot block in slice 0. This should no longer be necessary; vxrootmir should take care of this for us. I have run into circumstances in the past where vxrootmir has not set up the boot block properly; Veritas reports that those bugs have long since been fixed.)
  • Mirrors of the root disk must be configured with "sliced" format and should live in the bootdg disk group. They cannot be configured with cdsdisk format. If necessary, remove the disk and re-add it in vxdiskadm.
  • In vxdiskadm, choose option 6: Mirror Volumes on a Disk. Follow the prompts from the utility. It will call vxrootmir under the covers to take care of the boot disk setup portion of the operation.
  • When the process is done, attempt to boot from the boot mirror. (Check the EEPROM devalias settings to see which device alias has been assigned to the boot mirror, and run boot device-alias from the ok> prompt.)
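The non-interactive pieces of the procedure can be sketched as follows; c0t1d0 is a hypothetical boot-mirror device, the interactive vxdiskadm step is noted as a comment, and the wrapper prints each command instead of executing it unless RUN is set:

```shell
#!/bin/sh
# Sketch only: prepare a boot mirror. The device name c0t1d0 and the
# VxVM disk name "rootmirror" are hypothetical; with RUN unset the
# commands are printed rather than executed.
run() { if [ -n "$RUN" ]; then "$@"; else echo "$@"; fi; }

run eeprom 'use-nvramrc?=true'                       # do this BEFORE mirroring
run /etc/vx/bin/vxdisksetup -i c0t1d0 format=sliced  # root mirrors must be sliced
run vxdg -g bootdg adddisk rootmirror=c0t1d0
# Now run vxdiskadm interactively and choose option 6
# ("Mirror Volumes on a Disk"); it calls vxrootmir under the covers.
run vxprint -ht rootvol                              # afterwards, verify both plexes
```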

Creating a mirrored-stripe volume (a mirrored-stripe volume mirrors several striped plexes; a striped-mirror volume is usually the better choice):

  • vxassist -g dg-name make volume length layout=mirror-stripe

Creating a striped-mirror volume (striped-mirror volumes are layered volumes that stripe across underlying mirror volumes):

  • vxassist -g dg-name make volume length layout=stripe-mirror

Removing a plex from a mirror:

  • vxplex -g dg-name -o rm dis plex-name

Removing a mirror from a volume:

  • vxassist -g dg-name remove mirror volume-name

Removing a mirror and all associated subdisks:

  • vxplex -o rm dis plex-name

Dissociating a plex from a mirror (to provide a snapshot):

  • vxplex dis plex-name
  • vxmake -U gen vol new-volume-name plex=plex-name (Create a new volume from the dissociated plex.)
  • vxvol start new-volume-name

To re-associate the plex with the original volume:

  • vxvol stop new-volume-name
  • vxplex dis plex-name
  • vxplex att old-volume-name plex-name
  • vxedit rm new-volume-name
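Put together, a plex-dissociation snapshot and its later re-association might look like this; the disk group "datadg", volume "datavol", plex "datavol-02", and snapshot volume "snapvol" are hypothetical names, and the wrapper prints each command instead of executing it unless RUN is set:

```shell
#!/bin/sh
# Sketch only: use a dissociated plex of "datavol" as a point-in-time
# snapshot, then return it to the mirror. All names are hypothetical;
# with RUN unset the commands are printed rather than executed.
run() { if [ -n "$RUN" ]; then "$@"; else echo "$@"; fi; }

# Break off one plex and turn it into its own volume:
run vxplex -g datadg dis datavol-02
run vxmake -g datadg -U gen vol snapvol plex=datavol-02
run vxvol -g datadg start snapvol
# ... back up or examine snapvol here ...
# Re-associate the plex with the original volume:
run vxvol -g datadg stop snapvol
run vxplex -g datadg dis datavol-02
run vxplex -g datadg att datavol datavol-02
run vxedit -g datadg rm snapvol
```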

Removing a Root Disk Mirror:

  • vxplex -o rm dis rootvol-02 swapvol-02 [other root disk volumes]
  • /etc/vx/bin/vxunroot
