Monday, April 15, 2013

Solaris 10 Resource Management

Solaris 10 resource management is a major step forward over what was available in Solaris 8 and 9. In Solaris 10, we can manage resources at a zone, project or task level. This page focuses mainly on project-level resource management. Additional information is available in Sun's System Administration Guide: Solaris Containers-Resource Management and Solaris Zones on the Sun Documentation Web Site.

Projects

Projects are collections of tasks, which are collections of processes. A new task is started in a project when a new session is opened by a login, cron, newtask, setproject or su command. Each process belongs to only one task, and each task belongs to only one project.

The default project for a user is determined as per the getdefaultproj() man page.

When there is more than one policy in place for a particular object, the smallest container's control is enforced first.

Projects are maintained via the /etc/project file. Changes to /etc/project become available for new tasks in a project. (prctl and rctladm are used to perform runtime changes.)

The fields in an /etc/project entry are:

  • projname: Name of the project.
  • projid: Unique numerical project identifier less than UID_MAX (2147483647).
  • comment: Project description.
  • user-list: Comma-separated list of users.
  • group-list: Comma-separated list of groups.
  • attributes: Semicolon-separated list of name-value pairs, such as resource controls, in a name[=value] format.

After a default Solaris 10 installation, /etc/project contains the following:

system:0::::(default project for system processes and daemons)
user.root:1::::(processes owned by the root user)
noproject:2::::(IP Quality of Service)
default:3::::(default assigned to every otherwise unassigned user)
group.staff:10::::(default used for unassigned users in the "staff" group)

Parameters are set by adding them to the last field of the project entry:
projectname:101::::project.max-lwps=(privileged,200,deny)
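
As a rough illustration of the field layout described above, the following sketch splits one entry into its six fields and lists each attribute separately. The helper name show_project_entry is hypothetical (it is not a Solaris command), and the sketch assumes that no field contains an embedded colon.

```sh
# Print the six colon-separated fields of one /etc/project entry,
# then each semicolon-separated attribute on its own line.
# show_project_entry is a hypothetical helper, not a Solaris command.
show_project_entry() {
    printf '%s\n' "$1" | awk -F: '
    {
        print "projname: " $1
        print "projid: " $2
        print "comment: " $3
        print "user-list: " $4
        print "group-list: " $5
        # Field 6 holds the attributes, separated by semicolons.
        n = split($6, attrs, ";")
        for (i = 1; i <= n; i++)
            print "attribute: " attrs[i]
    }'
}

show_project_entry 'projectname:101::::project.max-lwps=(privileged,200,deny)'
```

For a live system, the same function could be fed lines read from /etc/project.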

Management Commands

Commands for managing project attributes include the following:

  • projects: Displays project memberships for users, lists projects from the project database, and prints information on given projects.
  • newtask: Executes the shell or command in a new task in the current project.
  • projadd: Adds a new entry to /etc/project.
  • projmod: Modifies information for a project in /etc/project.
  • projdel: Deletes a project from /etc/project.
  • rctladm: Displays/modifies global state of active resource controls, sets logging or actions.
  • prctl: Displays/modifies local resource controls.
  • ipcs: Identifies which IPC objects are being used in a project.
  • rcapadm: Manages rcapd memory-capping daemon.
  • prstat -J: Displays resource consumption on a per-project basis.
  • priocntl -i project-name: Sets/displays scheduling parameters of the project.
  • poolbind -i project-name: Assigns a project to a resource pool.

Usage examples are provided at the end of this page.

Privilege Levels

Each resource control threshold needs to be associated with one of the following privilege levels:

  • basic: Can be modified by the owner of the calling process.
  • privileged: Can be modified only by the superuser.
  • system: Fixed for the duration of the operating system instance.

IPC Resource Controls

The Solaris 10 IPC resource management framework fixes some serious problems in the older SVR4-based system. Some parameters were converted to be set dynamically, some defaults were increased, some parameters were retired, and the names of the surviving parameters were changed to be more human-readable.

In older Solaris versions, the resource limits were system-wide (causing potential conflicts) and reboots were required for even minor changes.

The Solaris 10 system permits project-based resource controls and allows controls to be monitored and changed via prctl.

Additional information about IPC resource management can be found on the IPC Issues page.

For the purposes of IPC resource management, the following are the important parameters:

  • project.max-shm-ids: Maximum shared memory IDs for a project. Replaces shmmni.
  • project.max-sem-ids: Maximum semaphore IDs for a project. Replaces semmni.
  • project.max-msg-ids: Maximum message queue IDs for a project. Replaces msgmni.
  • project.max-shm-memory: Total amount of shared memory allowed for a project. Replaces shmmax.
  • process.max-sem-nsems: Maximum number of semaphores allowed per semaphore set. Replaces semmsl.
  • process.max-sem-ops: Maximum number of semaphore operations allowed per semop. Replaces semopm.
  • process.max-msg-messages: Maximum number of messages on a message queue. Replaces msgtql.
  • process.max-msg-qbytes: Maximum number of bytes of messages on a message queue. Replaces msgmnb.
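
When migrating old /etc/system settings, it can help to have the mapping above in a form a script can consult. The following is only a sketch: new_rctl_name is a hypothetical helper name, and it covers just the eight parameters listed here.

```sh
# Map an old IPC tunable name to the Solaris 10 resource control that
# replaces it, following the list above. new_rctl_name is a hypothetical
# helper; unknown names produce a non-zero exit status and no output.
new_rctl_name() {
    case $1 in
        shmmni) echo project.max-shm-ids ;;
        semmni) echo project.max-sem-ids ;;
        msgmni) echo project.max-msg-ids ;;
        shmmax) echo project.max-shm-memory ;;
        semmsl) echo process.max-sem-nsems ;;
        semopm) echo process.max-sem-ops ;;
        msgtql) echo process.max-msg-messages ;;
        msgmnb) echo process.max-msg-qbytes ;;
        *) return 1 ;;
    esac
}

new_rctl_name shmmax    # prints project.max-shm-memory
```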

An Oracle-specific example is provided below.

Other Resource Controls

The new Solaris 10 resource controls include compatibility interfaces to the old rlimit-style resource controls. Existing applications using the old interfaces can continue to run unchanged.

Additional Resource Controls:

  • [zone|project].cpu-shares: Maximum CPU shares allowed (under Fair Share Scheduler)
  • [task|process].max-cpu-time: Maximum CPU time available to processes in this task.
  • project.max-contracts: Maximum number of contracts allowed
  • project.max-crypto-memory: Total kernel memory usable by libpkcs11 for hardware crypto acceleration.
  • project.max-device-locked-memory: Total locked memory allowed.
  • process.max-address-space: Maximum address space.
  • process.max-core-size: Maximum core dump size.
  • process.max-data-size: Maximum heap size.
  • process.max-file-descriptor: Maximum file descriptor index.
  • process.max-file-size: Maximum file offset allowed for writes.
  • process.max-stack-size: Maximum stack memory segment available.
  • [zone|project|task].max-lwps: Maximum number of LWPs available to the zone, project or task.
  • process.max-port-events: Maximum events per port.
  • project.max-port-ids: Maximum allowable event ports.
  • project.max-tasks: Maximum allowable tasks.
  • rcap.max-rss: Maximum physical memory consumption by processes in project.

A full list of resources is available on the resource_controls man page.

Resources beginning with the rcap string are associated with the rcapd resource-capping daemon.

rcapd

rcapd caps memory usage within a project. In each zone, rcapd can be enabled via
rcapadm -E
This command will start rcapd and set it up in SMF so that it will be restarted automatically.

We can use projmod to set the memory cap for a project:
projmod -s -K rcap.max-rss=sizeMB project-name
Alternatively, we can set the rcap.max-rss control directly in /etc/project.

rcapd does not account for shared memory in an intuitive way. To be safe, we need to allow enough room for shared memory to be included under the cap. We should not depend solely on rcapd to manage process memory.

Logging

Global logging can be enabled by setting syslog=level with rctladm, where level is one of the usual syslog levels: debug, info, notice, warning, err, crit, alert or emerg.

Actions

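Each threshold also carries a local action that is applied when a process exceeds the control. The action is the last element of the value in /etc/project and can be enabled or disabled at runtime with prctl (rctladm handles global actions such as the syslog logging above). The local actions are:

  • none: No action taken. (Useful for monitoring.)
  • deny: Denies the request.
  • signal=: Sends the named signal to the offending process. See the resource_controls man page for a list of allowed signals.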
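
The privilege level, threshold and action come together in a single parenthesized value such as (privileged,200,deny). As a minimal sketch, the hypothetical helper below (parse_rctl_value is not a Solaris command) takes such a value apart with plain shell parameter expansion. It assumes the three-part form used on this page and does not handle bare values such as the one used for rcap.max-rss.

```sh
# Split a resource control value of the form (privilege,threshold,action)
# into its parts. parse_rctl_value is a hypothetical helper name.
parse_rctl_value() {
    inner=${1#\(}            # drop the leading "("
    inner=${inner%\)}        # drop the trailing ")"
    privilege=${inner%%,*}   # text before the first comma
    rest=${inner#*,}         # text after the first comma
    threshold=${rest%%,*}    # text before the second comma
    action=${rest#*,}        # everything after the second comma
    printf 'privilege=%s\nthreshold=%s\naction=%s\n' \
        "$privilege" "$threshold" "$action"
}

parse_rctl_value '(privileged,200,deny)'
```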

Command Examples

The projadd man page provides an example of how to add a project:

The following command creates the project salesaudit and sets the resource controls specified as arguments to the -K option.

projadd -p 111 -G sales,finance -c "Auditing Project" -K "rcap.max-rss=10GB" -K "process.max-file-size=(priv,50MB,deny)" -K "task.max-lwps=(priv,100,deny)" salesaudit

This command would produce the following entry in /etc/project:
salesaudit:111:Auditing Project::sales,finance:process.max-file-size=(priv,52428800,deny);rcap.max-rss=10737418240;task.max-lwps=(priv,100,deny)
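
The byte counts in this entry are the 50MB and 10GB arguments expanded with binary multiples (1MB = 1048576 bytes, 1GB = 1073741824 bytes). The sketch below reproduces that arithmetic so the numbers can be checked by hand. size_to_bytes is a hypothetical helper, not part of Solaris, and it recognizes only the KB, MB, GB and TB suffixes; anything else is treated as a plain byte count.

```sh
# Convert a size such as 50MB or 10GB to bytes using binary multiples.
# size_to_bytes is a hypothetical helper, not part of Solaris.
size_to_bytes() {
    awk -v s="$1" 'BEGIN {
        n = s; unit = s
        sub(/[A-Za-z]+$/, "", n)       # keep the numeric part
        sub(/^[0-9.]+/, "", unit)      # keep the unit suffix
        unit = toupper(unit)
        mult = 1                       # no known suffix: plain bytes
        if (unit == "KB") mult = 1024
        if (unit == "MB") mult = 1024 * 1024
        if (unit == "GB") mult = 1024 * 1024 * 1024
        if (unit == "TB") mult = 1024 * 1024 * 1024 * 1024
        printf "%.0f\n", n * mult
    }'
}

size_to_bytes 50MB    # prints 52428800
size_to_bytes 10GB    # prints 10737418240
```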

To start up a task under this project, run the following:
newtask -p salesaudit command

A running process can be associated with a new task:
newtask -v -p project-name -c PID

To verify the project governing the current shell, we would run:
id -p

To view resource constraints for a process, we would run something like the following:
prctl -n resource-name -i process PID

To view resource constraints for the current shell, we could run:
prctl $$

To temporarily set resource constraints on a particular project, we could run something like:
prctl -n resource-name -t privilege-level -v value -e action -i project project-name

To activate logging on a global resource control facility, run something like:
rctladm -e syslog=level resource-name

To list all existing projects, run:
projects -l

To see how a project's IPC objects are allocated against existing limits, run something like:
ipcs -J

To display a process's project id, use a command of the form:
ps -o projid -p PID

To match project or task ids for pgrep, pkill or prstat commands, use the -T or -J options:
pgrep -J project-IDs
pkill -T task-IDs
prstat -J

Oracle Setup Example

Oracle 9i recommends several minimum semaphore and shared memory settings. Since Solaris 10 has increased the defaults on several settings above previous levels, and since several other ones have become obsolete, only the shmmax parameter should need to be set.

In particular, the new defaults for some key parameters are:

  • semmni: 128 (100 recommended)
  • semmsl: 512 (256 recommended)
  • shmmni: 128 (100 recommended)

The following are obsolete:

The projmod command can be used to set the shmmax to the desired level (default is 1/4 physical memory):
projmod -sK "project.max-shm-memory=(privileged,gigabytes-sharedGB,deny)" project-name

It makes sense to set up projects (and project limits) for each environment on the server. To ensure that each instance actually starts up in the proper project, the startup scripts will need to include a
newtask -p project-name
line.

A full example of this type is found in Chapter 4 of The Sun BluePrints Guide to Solaris Containers.

Default Project

The default project for a user is determined as per the getdefaultproj() man page:

The getdefaultproj() function first looks up the project key word in the user_attr database used to define user attributes in restricted Solaris environments. If the database is available and the keyword is present, the function looks up the named project, returning NULL if it cannot be found or if the user is not a member of the named project. If absent, the function looks for a match in the project database for the special project user.username. If no match is found, or if the user is excluded from project user.username, the function looks at the default group entry of the passwd database for the user, and looks for a match in the project database for the special name group.groupname, where groupname is the default group associated with the password entry corresponding to the given username. If no match is found, or if the user is excluded from project group.groupname, the function returns NULL. A special project entry called 'default' can be looked up and used as a last resort, unless the user is excluded from project 'default'. On successful lookup, this function returns a pointer to the valid project structure. By convention, the user must have a default project defined on a system to be able to log on to that system.
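
The lookup order in that description can be sketched in shell as follows. This is a simplified illustration, not the library's implementation: project_exists and default_project are hypothetical helpers, the user_attr project keyword is passed in as an optional fourth argument instead of being read from the user_attr database, and the membership and exclusion checks are skipped entirely.

```sh
# project_exists NAME FILE: succeed if NAME is the first field of an entry.
project_exists() {
    awk -F: -v name="$1" '$1 == name { found = 1 } END { exit !found }' "$2"
}

# default_project USER GROUP PROJECT_FILE [USER_ATTR_PROJECT]
# Prints the default project name, or returns non-zero if none is found.
# Simplification: membership and exclusion checks are not performed.
default_project() {
    user=$1; group=$2; file=$3; keyword=${4:-}
    if [ -n "$keyword" ]; then
        # A project named in user_attr is either used or the lookup fails.
        if project_exists "$keyword" "$file"; then
            echo "$keyword"
            return 0
        fi
        return 1
    fi
    # Otherwise try user.username, then group.groupname, then default.
    for candidate in "user.$user" "group.$group" default; do
        if project_exists "$candidate" "$file"; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Demo against a copy of the default entries shown earlier on this page.
sample=$(mktemp)
cat > "$sample" <<'EOF'
system:0::::
user.root:1::::
noproject:2::::
default:3::::
group.staff:10::::
EOF
default_project root root "$sample"     # prints user.root
default_project alice staff "$sample"   # prints group.staff
rm -f "$sample"
```

The user name alice in the demo is made up for illustration. On a real system, id -p remains the authoritative way to see which project a session landed in.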

Additional Reading

Additional information is available in Sun's System Administration Guide: Solaris Containers-Resource Management and Solaris Zones and The Sun BluePrints Guide to Solaris Containers on the Sun Documentation Web Site.
