Saturday, May 11, 2013

UFS File System Troubleshooting

Filesystem corruption can be detected and often repaired by the format and fsck commands. If the filesystem corruption is not due to an improper system shutdown, the hard drive hardware may need to be replaced.

ufs filesystems contain the following types of blocks:

  • boot block: This stores information used to boot the system.
  • superblock: Much of the filesystems internal information is stored in these.
  • inode: Stores location information about a file--everything except for the file name. The number of inodes in a filesystem can be changed from the default if newfs -i is used to create the filesystem.
  • data block: The file's data is stored in these.

fsck

The fsck command is run on each filesystem at boot time. This utility checks the internal consistency of the filesystem, and can make simple repairs on its own. More complex repairs require feedback from the root user, either in terms of a "y" keyboard response to queries, or invocation with the -y option.

If fsck cannot determine where a file belongs, the file may be renamed to its inode number and placed in the filesystem's lost+found directory. If a file is missing after a noisy fsck session, it may still be intact in the lost+found directory.

Sometimes the fsck command complains that it cannot find the superblock. Alternative superblock locations were created by newfs at the time that the filesystem was created. The newfs -N command can be invoked to nondestructively discover the superblock locations for the filesystem.

ufs filesystems can carry "state flags" that have the value of fsclean, fsstable, fsactive or fsbad (unknown). These can be used by fsck during boot time to skip past filesystems that are believed to be okay.

format

The analyze option of format can be used to examine the hard drive for flaws in a nondestructive fashion.

df

df can be used to check a filesystem's available space. Of particular interest is df -kl, which checks available space for all local filesystems and prints out the statistics in kilobytes. Solaris 10 also allows us to use df -h, which presents the statistics in a more human-friendly form that doesn't require counting digits to decide whether a file is 100M or 1G in size.

du

du can be used to check space used by a directory. In particular, du -dsk will report useage in kilobytes of a directory and its descendants, without including space totals from other filesystems.

Filesystem Tuning

Filesystem performance can be improved by looking at filesystem caching issues.

The following tuning parameters may be valuable in tuning filesystem performance with tunefs or mkfs/newfs:

  • inode count: The default is based upon an assumption of average file sizes of 2 KB. This can be set with mkfs/newfs at the time of filesystem creation.
  • time/space optimization: Optimization can be set to allow for fastest performance or most efficient space useage.
  • minfree: In Solaris 2.6+, this is set to (64 MB / filesystem size) x 100. Filesystems in earlier OS versions reserved 10%. This parameter specifies how much space is to be left empty in order to preserve filesystem performance.
  • maxbpg: This is the maximum number of blocks a file can leave in a single cylinder group. Increasing this limit can improve large file performance, but may have a negative impact on small file performance.

Filesystem Performance Monitoring

McDougall, Mauro and Gregg suggest that the best way to see if I/O is a problem at all is to look at the amount of time spent on POSIX read() and write() system calls via DTrace. If so, we need to look at the raw disk I/O performance.

No comments: